FOREWORD


The  host  for  the  Thirty-Seventh  Army  Design  of  Experiments 
Conference  (DOE)  was  the  Waterways  Experiment  Station  (WES)  at 
Vicksburg,  Mississippi.  Dr.  Billy  Johnson  did  an  outstanding  job 
in  managing  all  the  local  arrangements.  Dr.  Robert  Whalen, 
Technical  Director  of  WES,  kindly  provided  a  5:00  to  7:30  PM  tour 
of  the  WES  laboratories  on  the  second  day  of  the  conference. 


This  conference  is  one  of  the  two  Army  conferences  that  the 
Mathematical  Sciences  Division,  of  the  Army  Research  Office,  holds 
for  Army  scientists.  Although  the  title  of  the  DOE  Conference 
suggests  that  it  is  concerned  only  with  design  of  experiments,  it 
is actually a statistics conference and covers a wide range of
topics  in  statistics. 

An  occasional  talk  in  high  level  mathematical  statistics  or 
probability  theory  intrudes  on  the  scene,  but  generally  topics  such 
as  these,  that  are  heavily  mathematical,  are  given  in  the  Army 
Conference  on  Applied  Mathematics  and  Computer  Sciences.  The  DOE 
Conference  is  very  much  a  teaching  conference  having  clinical 
papers  with  a  panel  of  discussants,  analytic  papers  (application  of 
known theory), the usual technical talks, and invited papers.

The  keynote  address  was  given  by  Professor  G.  Watson.  He  and  the 
other  invited  speakers  are  listed  below. 


Speaker and Affiliation            Title of Address

Dr. William H. Du Mouchel          Bayesian Meta Analysis
BBN Software Products

Professor Ian McKeague             Identification of Nonlinear
Florida State University           Time Series From First Order
                                   Cumulative Characteristics

Professor Emanuel Parzen           Change Analysis
Texas A&M University

Professor Isabella Verdinelli      A Bayesian Look at Experimental
Carnegie-Mellon University         Design

Professor Geoffrey Watson          The Use of Simulation in
Princeton University               Statistical Inference

Professor Edward Wegman            The Straight Scoop on Wavelets
George Mason University            and Nonparametric Function
                                   Estimation




Dr. James R. Thompson, Professor of Statistics at Rice University, was
named recipient of the tenth U.S. Army Wilks Award for
Contributions to Statistical Methodologies in Army Research,
Development and Testing. Professor Thompson's interest in
important Army problems and his willingness to interact
with Army researchers is well established. His work, with Dr. M. S.
Taylor, on data based nonparametric density estimation arising from
modeling of multivariate ballistic data, and his faithful support of
the Design of Experiments Conference, including presentation of
[...] Modeling and Simulation [...]
Empirical Model Building at the Thirty-Second Conference on the
Design of Experiments, are signal contributions.

A two-day tutorial precedes these conferences. This year
the topic was recent developments in "Time Series Analysis" and
was given by Professor Joseph Newton from Texas A&M University. He
gave an excellent, information packed short course. His lecture
[...] these proceedings.


The committee for these conferences is the Army Mathematics Steering
Committee. The members of this committee are duly aware of all the
work that goes into making these conferences such memorable
events. Their thanks go to all those in attendance. The speakers,
in particular, need recognition for the time they spent
preparing and delivering their scientific papers.


Program Committee

Carl Bates        Robert Burge          Francis Dressel
Eugene Dutoit     Malcolm Taylor        Jerry Thomas
Douglas Tang      Henry Tingey
Barry Bodt        Jock O. Grynovicki
Simulation in Statistical Inference


Geoffrey  Watson*,  Princeton  University 


Abstract 

In  this  paper  I  will  not  include  one  topic  discussed  in  my  actual  talk  since  the 
paper  is  already  too  long.  The  bulk  of  the  paper  describes  our  mathematical  and 
computational  studies  of  the  parametric  bootstrap.  This  is  largely  expository  -  our 
examples  are  chosen  to  increase  our  intuition  about  how  to  proceed  in  analogous 
situations.  There  are  many  uses  for  the  parametric  bootstrap  but  the  literature 
gives  little  practical  guidance  when  there  are  many  parameters.  We  hope  that  this 
paper  will  also  be  a  useful  introduction  to  the  nonparametric  bootstrap  which  was 
the  initial  problem  of  bootstrap  studies  and  is  still  their  central  interest.  I  believe 
this  wonderful  tool  is  more  subtle  than  many  think  it  is.  The  last  section  is  original 
and  shows  how,  by  converting  a  testing  problem  into  one  of  estimation,  simulation 
leads  to  the  solution  of  an  important  problem  in  paleomagnetism;  bootstrapping  is 
part  of  this  solution. 

1  Introduction 

Simulation  has  a  long  history.  “Student”  checked  his  formula  for  the  t- 
distribution  by  manual  simulation  -  drawing  numbers  on  slips  of  paper.  The  early 
volumes  of  Biometrika  are  always  fascinating  to  read;  my  undergraduate  "senior 
thesis"  was  a  summary  of  studies  of  the  effects  of  non-normality  which  I  mostly 
found  there.  Many  involved  manual  simulation.  In  the  mid-30's  Pitman  explored 
Fisher's  ideas  of  randomization  &  permutation  distributions.  He  found  their  early 
moments  by  algebraic  methods.  However  everyone  was  clear  that  one  could 
approximate the wanted distributions numerically. But it was then very slow work.

Even  when  computing  gradually  became  easier  and  cheaper,  few  of  us 
“changed  heads".  Machines  were  just  used  to  do  computations  (e.g.  numerical 
linear  algebra)  that  were  just  a  bit  bigger  than  those  we  were  already  doing. 

*Research sponsored by NSF Grant DMS 9118 896.


Efron  changed  everything  with  his  “bootstrap"  about  13  years  ago.  Perhaps  I 
should  have  called  this  talk  the  “Stimulation  in  Stat’l  Inference  !!!!!!!!" 

The emphasis in the bootstrap literature was, and still is, on NON-PARAMETRIC
methods  i.e.  methods  that  will  be  100%  effective  whatever  the  situation  when  the 
sample  size  tends  to  infinity.  However  an  earlier  paper  by  Efron  pointing  out  that 
the  method  of  getting  accuracies  of  maximum  likelihood  estimators  works  better 
with  sample  likelihoods  than  expectations  is  a  parametric  bootstrap  argument! 

There is now a large, rapidly growing and difficult literature; the following
monographs give the key ideas and facts: Efron (1982), Beran (1991), Hall (1991).
But it is very hard, even for an academic, to "catch up" with current technology. I
hope that this paper will help applied statisticians to understand and perhaps use
some of the basic ideas. As the "keynote" speaker for this conference, I felt it was
more essential for me to address a topic vital to everyone than to talk about
something on which I have the most expertise!

I drifted into this area because I had to deal with complicated parametric
estimation problems where classical theory was not much help, so I tried "to
simulate my way out of trouble" - ever the innocent optimist! I was soon told that I
was just rediscovering old bootstrap results. I had not troubled to study the
bootstrap before. This was partly from laziness and partly because I was put off by
what I thought was its uncritical embrace for small samples - I could see its
asymptotic justifications. And I was not so keen on getting non-parametric
methods. I soon discovered it was much more subtle than I thought! Further, I think
its use for parametric problems is not only very useful but a good introduction to
the use of the nonparametric bootstrap.

Section 2 will introduce bootstrap ideas via parametric problems which are
so simple that one can see what bootstrapping will do without having to simulate
at all! Further, we work with the m.l. estimators. Of course in more complex
problems one must simulate. Section 3 shows several experiments with problems
where the number of parameters goes from medium to large; these simulations
were carried out by Javier Cabrera. In Section 4 we will show how simulation
may be used to solve an important problem in paleomagnetism. Though it is
logically different from the above, bootstrapping is an essential part of the trick!
The computations shown were done by Michel Debiche.
2 Simple examples

Example 1 Original "naive" bootstrap.

Consider a sample of n from

$$f(x, \theta) = \theta^{-1} \exp(-x/\theta)$$

with arithmetic mean m, the (unbiased) mle for $\theta$. The "naive" bootstrap way of
getting a confidence interval $\theta < c_\beta$ of size $\beta$ is, I believe, to draw N samples of size n
from f(x, m), find the mean $m^*$ of each, and find $c_\beta$ so that $N\beta$ of the $m^*$'s are less than
$c_\beta$; we then have the "naive" or "percentile" interval

$$\theta < c_\beta. \qquad (1)$$

This is based on the plausible belief that the distribution of m, sampling
from $f(x, \theta)$, will be approximately the same as that of $m^*$ when sampling from f(x, m),
fixed m. Now if N is large, our simulation will give the same answer that we know
in this example from distribution theory: given m, $m^*/m$ is distributed as $\chi^2_{2n}/2n$,
because we know that

$$m/\theta \text{ is distributed as } \chi^2_{2n}/2n. \qquad (2)$$

Write $\mathrm{Prob}(\chi^2_{2n} < c_{2n}(\beta)) = \beta$. Hence the "naive" interval (1) is, when N is large,
close to

$$\theta < m\, c_{2n}(\beta)/2n.$$

But directly from (2) we get the correct (if N large) interval

$$\mathrm{Prob}\{ m/\theta > c_{2n}(1-\beta)/2n \} = \beta, \text{ or } \mathrm{Prob}\{ \theta < m \cdot 2n/c_{2n}(1-\beta) \} = \beta.$$

For n = 5, $\beta$ = .95,

$$c_{2n}(\beta)/2n = 1.83, \qquad 2n/c_{2n}(1-\beta) = 2.53.$$

So the naive interval is much too small; its true coverage is much less than the
nominal .95. Hence the use of "NAIVE"!
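The gap between the two intervals can be checked by direct simulation. The following is a minimal sketch, not part of the original study: it assumes a true mean of 1, uses only the two factors 1.83 and 2.53 quoted above for n = 5, and the function name and simulation size are illustrative choices.

```python
import random

def upper_bound_coverage(factor, n=5, theta=1.0, reps=20000, seed=0):
    """Fraction of simulated samples for which theta < factor * m."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # expovariate takes the rate, i.e. 1 / mean
        m = sum(rng.expovariate(1.0 / theta) for _ in range(n)) / n
        if theta < factor * m:
            hits += 1
    return hits / reps

# Factors quoted in the text for n = 5 and a nominal level of .95
naive_cov = upper_bound_coverage(1.83)    # naive (percentile) interval
pivotal_cov = upper_bound_coverage(2.53)  # interval based on the pivot m/theta
print(f"naive: {naive_cov:.3f}  pivotal: {pivotal_cov:.3f}")
```

Under these assumptions the naive interval should cover noticeably less often than the nominal level, while the pivotal one should sit close to it.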

What is wrong with the "naive" argument?

The distribution of $m^*$, given m, is NOT the same as (only like) the
distribution of m, given $\theta$.
It is the distributions of $m^*/m$, given m, and $m/\theta$, given $\theta$, that are the same!
Notationally

$$\mathcal{L}(m^*/m) \text{ at } m = \mathcal{L}(m/\theta) \text{ at } \theta.$$

This is true - see (2) - because the distribution of $m/\theta$ does not depend on $\theta$, i.e.

$m/\theta$ is a PIVOTAL FUNCTION.

Pretend now that we don't know that $m/\theta$ has the distribution given by (2). We
can here approximate it as closely as we like by taking N large enough in a
simulation study of $m^*/m$. Thus we can get an almost exact confidence interval
by drawing N samples of size n from the exponential distribution with mean m
and finding k so that the proportion of $m^*/m$'s > k equals $\beta$, to good approximation. Then we
may assert that it should be accurate to say that

$\mathrm{Prob}\{ m/\theta > k \} = \beta$, or equivalently that "$\theta < m/k$" is a confidence interval of size $\beta$.
When N tends to infinity we must get the exact confidence interval this way.

The moral of this is that one should use pivotal, or more practically,
asymptotically pivotal, statistics when bootstrapping.
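The simulation recipe just described, for the exponential case, might be sketched as follows. The function name, the default simulation size and the data are illustrative assumptions, not taken from the original computations.

```python
import random

def pivotal_upper_bound(sample, beta=0.95, N=5000, seed=0):
    """Upper confidence bound m/k for an exponential mean, using the
    simulated distribution of the pivot m*/m (parametric bootstrap)."""
    rng = random.Random(seed)
    n = len(sample)
    m = sum(sample) / n
    # N bootstrap samples of size n from the exponential with mean m
    ratios = sorted(
        sum(rng.expovariate(1.0 / m) for _ in range(n)) / n / m
        for _ in range(N)
    )
    # k chosen so that a proportion of about beta of the ratios exceed it
    k = ratios[int((1.0 - beta) * N)]
    return m / k

data = [0.8, 1.9, 0.3, 2.4, 1.1]  # hypothetical sample, n = 5
bound = pivotal_upper_bound(data)
print(bound, bound / (sum(data) / len(data)))
```

For n = 5 the ratio of the bound to the sample mean should land near the factor 2.53 given by distribution theory, up to simulation error.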

Example 2 Iteration of the bootstrap, to reduce bias. Prepivoting

Consider a sample of n from $U[0, \theta]$, the Uniform distribution on $[0, \theta]$. Then
$\hat\theta_0 = M = \max(X_1, \ldots, X_n)$ is the biased {its expectation is $[n/(n+1)]\theta$} mle for $\theta$ with
SD $\theta/n$, approx. Can we improve it, e.g. get a less biased version, by using the
bootstrap? I am pretending we don't know all about this estimator but I will use this
knowledge in place of large simulations.

Consider the following idea:

1/ Draw N samples of size n from U[0, M], each time finding its maximum $M^*_j$, j =
1, ..., N. Compute their average Ave $M^*$.

Method (a). A corrected-once estimator is

$$\hat\theta_1 = M - (\text{bias estimate, } \mathrm{Ave}\,M^* - M) = 2M - \mathrm{Ave}\,M^*,$$

since we'd guess that Ave $M^*$ - M would be close to the true bias $EM - \theta$. We call
this the additive or linear correction method.

Method (b). Another corrected-once estimator is

$$\hat\theta_1 = M^2/\mathrm{Ave}\,M^*.$$

This comes from multiplying M by the bias correcting factor M/Ave $M^*$. We call this
the multiplicative correction method.

In  each  case  the  subscript  “1"  is  used  because  one  can  repeat  the  process. 
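A minimal sketch of the first round of correction for the uniform example, computing both the additive estimate 2M - Ave M* and the multiplicative estimate M^2/Ave M*. The sample values, the function name and the simulation size are hypothetical.

```python
import random

def corrected_once(sample, N=5000, seed=0):
    """One round of parametric-bootstrap bias correction for the
    maximum M of a sample assumed to come from U[0, theta]."""
    rng = random.Random(seed)
    n = len(sample)
    M = max(sample)
    # Average of the maxima of N samples of size n drawn from U[0, M]
    ave_m_star = sum(
        max(rng.uniform(0.0, M) for _ in range(n)) for _ in range(N)
    ) / N
    additive = 2.0 * M - ave_m_star       # Method (a)
    multiplicative = M * M / ave_m_star   # Method (b)
    return M, additive, multiplicative

sample = [0.12, 0.95, 0.33, 1.0, 0.48, 0.71, 0.05, 0.87, 0.60, 0.29]
M, add1, mult1 = corrected_once(sample)
print(M, add1, mult1)
```

With n = 10 and M = 1, Ave M* should be close to n/(n+1), so the multiplicative estimate should be near (n+1)/n = 1.1 and the additive one slightly smaller.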

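2/ Draw N samples from $U[0, \hat\theta_1]$, find their maxima $M^{**}$ and their average Ave $M^{**}$.

The twice-corrected estimators could plausibly be chosen to be

$$\hat\theta_2 = \hat\theta_1 - [\mathrm{Ave}\,M^{**} - \hat\theta_1] = 2\hat\theta_1 - \mathrm{Ave}\,M^{**},$$

$$\hat\theta_2 = \hat\theta_1 \cdot \hat\theta_1/\mathrm{Ave}\,M^{**} = \hat\theta_1^2/\mathrm{Ave}\,M^{**}.$$

However analysis shows this intuition is not right and that we should argue as
follows.

Additive - The idea that gave $\hat\theta_1$ was to assert

$$\mathcal{L}[\hat\theta_0 - \theta] \text{ at } \theta = \mathcal{L}[M^* - \hat\theta_0] \text{ at } \hat\theta_0,$$

suggesting

$$\hat\theta_0 - \hat\theta_1 = \mathrm{Ave}[M^* - \hat\theta_0], \text{ or } \hat\theta_1 = \hat\theta_0 - [\mathrm{Ave}\,M^* - \hat\theta_0].$$

If $\hat\theta_1$ is closer to $\theta$, similar use of

$$\mathcal{L}[\hat\theta_0 - \theta] \text{ at } \theta = \mathcal{L}[M^{**} - \hat\theta_1] \text{ at } \hat\theta_1$$

should lead to the better

$$\hat\theta_2 = \hat\theta_0 + [\hat\theta_1 - \mathrm{Ave}\,M^{**}], \text{ and } \hat\theta_3 = \hat\theta_0 + [\hat\theta_2 - \mathrm{Ave}\,M^{***}], \text{ etc.}$$

Multiplicative - The idea was to assert

$$\mathcal{L}[\hat\theta_0/\theta] \text{ at } \theta = \mathcal{L}[M^*/\hat\theta_0] \text{ at } \hat\theta_0,$$

or better

$$\mathcal{L}[\hat\theta_0/\theta] \text{ at } \theta = \mathcal{L}[M^{**}/\hat\theta_1] \text{ at } \hat\theta_1,$$

so use the latter to get the new $\hat\theta_2$ as

$$\hat\theta_2 = \hat\theta_0[\hat\theta_1/\mathrm{Ave}\,M^{**}], \text{ and } \hat\theta_3 = \hat\theta_0[\hat\theta_2/\mathrm{Ave}\,M^{***}], \text{ etc.}$$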

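We now show some properties of these sequences of estimators of $\theta$.

Analysis. If N is large, we can avoid arithmetic by recalling that

$$EM = \theta n/(n+1), \qquad \mathrm{var}\,M = \theta^2/n^2 \text{ approx.}$$

Method (a)

$$E\hat\theta_1 = 2\theta[n/(n+1)] - \theta[n/(n+1)]^2 = \theta\{1 - (n+1)^{-2}\},$$

$$\mathrm{var}\,\hat\theta_1 \approx \mathrm{var}\,M\,\{1 + N^{-1} O(1 + 1/n^2)\},$$

so that bias is now slightly down to $O(1/n^2)$ while variance is essentially that of $M =
\hat\theta_0$. The bias in $\hat\theta_1$ is now smaller than its SD.

$$E\hat\theta_2 = E\hat\theta_0 + [E\hat\theta_1 - E\,\mathrm{Ave}\,M^{**}] = E\hat\theta_0 + E\hat\theta_1\{1 - n/(n+1)\},$$

$$E\hat\theta_2 = \theta\{1 - 1/(n+1)^3\},$$

an improvement over $E\hat\theta_1 = \theta\{1 - 1/(n+1)^2\}$ but not worth having since the SDs of $\hat\theta_1$
and $\hat\theta_2$ are O(1/n).

Method (b)

$$E\hat\theta_1 = E_M\,E\{M^2/\mathrm{Ave}\,M^* \mid M\} \approx E_M\,M^2/EM^*, \text{ for N large,}$$

$$= E\,M^2/\{[n/(n+1)]M\} = [(n+1)/n]\,EM = \theta,$$

so the BIAS is zero.

$$\mathrm{var}\,\hat\theta_1 = \theta^2/n(n+2), \text{ approx.}$$

$$E\hat\theta_2 = E\,\hat\theta_0\hat\theta_1/\mathrm{Ave}\,M^{**} \approx E\,\hat\theta_0\hat\theta_1/EM^{**}, \text{ N large,}$$

$$= E\,\hat\theta_0\hat\theta_1/\{[n/(n+1)]\hat\theta_1\} = [(n+1)/n]\,E\hat\theta_0 = \theta, \text{ as for } \hat\theta_1.$$

Thus the multiplicative correction method is better than the additive but both
reduce the bias. Why? Because it is based on a correct assertion - and this is so
because here $M/\theta$ is a pivot.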

Now we look at CONFIDENCE INTERVALS FOR $\theta$. If we use the multiplicative
approach, we expect here to get the right answer because $M/\theta$ is a pivot, i.e.

$$\mathcal{L}(M/\theta) \text{ at } \theta = \mathcal{L}(M^*/M) \text{ at } M. \qquad (4)$$

The cdf of both random variables is $k^n$, 0 < k < 1.

Let $q_\beta(M^*)$ be the estimated $\beta$-quantile of $M^*$, found from the N samples of size n
drawn from U(0, M), i.e. $N\beta$ (approx) of the $M^*$'s are less than or equal to this q.

Belief in (4) leads us to assert that

$$\mathrm{Prob}\{ M/\theta < q_\beta(M^*)/M \} = \beta. \qquad (5)$$

Now if we let N tend to infinity, for any fixed M, $q_\beta(M^*)/M$ tends to k where $k^n = \beta$.
So that $M/\theta < q_\beta(M^*)/M$ or $\theta > M^2/q_\beta(M^*)$ gives a confidence interval with exactly
the coverage desired, $\beta$!

However if, in the same problem, we use the linear method and the first round
of simulations only, we would not expect to get an exact confidence interval. In
fact we would give as the one-sided $\beta$-interval (100$\beta$%)

$$M - q_\beta(M^* - M) < \theta,$$

or

$$\theta > 2M - q_\beta(M^*).$$

Now as N tends to infinity, M being fixed, we have just seen that $q_\beta(M^*)$ tends
to kM. Thus the interval becomes

$$\theta/M > 2 - k, \quad k^n = \beta, \quad 0 < k < 1.$$

The actual coverage of this interval is easily seen to be

$$(2-k)^{-n} = (2 - \beta^{1/n})^{-n},$$

which is tabled below

          $\beta$=.9    $\beta$=.95   $\beta$=.99
n=4      .90244   .95062   .99002
  8      .90123   .95031   .99001
 10      .90099   .95025   .99001
 16      .90062   .95016   .99000
 20      .90050   .95012   .99000

Although we have made intervals from a nonpivotal function, the coverages
are very accurate. We may not be so lucky in other cases.
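The tabled coverages of the linear-method interval follow directly from the closed form, with k the n-th root of the nominal level. A short sketch that recomputes them (the function name is an illustrative choice):

```python
def linear_interval_coverage(n, beta):
    """Actual coverage (2 - beta**(1/n))**(-n) of the linear-method
    interval for the uniform example, in the large-N limit."""
    k = beta ** (1.0 / n)
    return (2.0 - k) ** (-n)

for n in (4, 8, 10, 16, 20):
    row = [linear_interval_coverage(n, beta) for beta in (0.9, 0.95, 0.99)]
    print(n, " ".join(f"{c:.5f}" for c in row))
```

Each coverage exceeds the nominal level by a small amount that shrinks as n grows, which is the pattern seen in the table.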
Iterative bias correction is worthwhile here because the variance is so small!

Remark 1

This is an example of non-regular m.l. Thus the variance of the estimator
cannot be obtained from the likelihood and the limiting distribution is non-normal.
In fact the studentized form

$$t = [\theta - M]/[M/n]$$

is exactly pivotal and has a standard exponential distribution when
n tends to infinity. But to do this one needs to know the formula for the SD of M. In
the regular case one could use the negative derivative of the likelihood to get the
correct divisor.

To proceed this way here one would have to use a double bootstrap - in
effect to find the variance formula.

Remark 2

Had we studied the estimation of $\mu$ from a sample of n from $N(\mu, 1)$, of course
the additive method would be best since sample mean - $\mu$ = m - $\mu$ is then pivotal. If
we have $N(\mu, \sigma^2)$ its distribution depends on $\sigma$, and Student's t is a pivot now. See
Example 3.

Remark 3

For FINITE simulation size N, we keep adding a little variance and so mess
things up a bit.
Remark 4

CAN WE FIX UP A NONPIVOTAL FUNCTION? e.g. $\theta - M$ in Example 2?

Beran (1988) suggests the use of an idea he calls PREPIVOTING.

He calls the function of estimator and parameter that we use (e.g. $\theta - M$, $M/\theta$)
a "root". In general call it $r(\hat\theta, \theta)$.

Recall that if rv X has cdf (continuous) F(x), F(X) is uniformly distributed on
[0,1] so $\mathrm{Prob}\{F(X) < \alpha\} = \alpha$.

Suppose in some general problem we knew the cdf of the root $r(\hat\theta, \theta)$ to be
$H(r, \theta)$. Then

$$H[r(\hat\theta, \theta), \theta] \text{ is } U[0,1]$$

so that the set

$$\{\theta : H[r(\hat\theta, \theta), \theta] < \alpha\} \text{ has probability } \alpha. \qquad (6)$$

The trick then is to find or approximate the cdf H. Then (6) should give a
confidence interval of the right coverage; note that theoretically there are many
intervals with the right coverage, some more sensible than others.
One needs two levels of sampling, i.e. a double (not iterated) bootstrap.

First try it in the Uniform of Example 2. Use sampling from U(0, M) to find
approximately the cdf of the root $M - M^*$. Call it H(k, M). One might hope that

$H(\theta - M, M)$ is approx U(0,1) so that
$\{\theta : H(\theta - M, M) < \alpha\}$ = Conf Int. of approx size $\alpha$.

If N is very large, H will be what we get from distribution theory: $1 - (1 - k/M)^n$.

Then Beran's interval would be

$$\{\theta : 1 - [1 - (\theta - M)/M]^n < \alpha\}$$

or

$$\{\theta : \theta/M < 2 - (1-\alpha)^{1/n}\}.$$

But the probability of this last statement being true was tabulated above - it
was surprisingly close to $\alpha$.

Thus here Prepivoting has converted a nonpivotal function $\theta - M$ into a pivotal
one $M/\theta$ and led to almost exact confidence intervals. With finite samples we
should approximate this happy state.

More Generally

Draw N[1] samples from $f(x, \hat\theta)$, obtaining N[1] estimates which we will
suppose when ordered from least to largest to be

$$\theta^*_1, \ldots, \theta^*_{N[1]}.$$

Now draw N[2] samples from the density $f(x, \theta^*_i)$, computing each time the
estimator $\theta^{**}$, and build their empirical distribution. Call it $\hat H(\cdot, \theta^*_i)$. With this done
for all N[1] values of i, we compute the N[1] values of $\hat H(\hat\theta, \theta^*_i)$.

If $\hat H(\hat\theta, \theta^*_i) < \alpha$, accept $\theta^*_i$; if not, reject it. If the simulation sizes are large
enough, these values should give a simple conf. int. for $\theta$, obtained by a double
layer of sampling!!

Notice how the arithmetic can get out of hand here - we need
$O(N^2)$ samples.
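The two-layer scheme can be sketched as below. This is only an illustration under stated assumptions: the model is taken to be exponential with the sample mean as estimator (the text's scheme is general), and the function names, simulation sizes and the value of the estimate are hypothetical.

```python
import random

def sample_mean(rng, theta, n):
    """Mean of n exponential draws with mean theta."""
    return sum(rng.expovariate(1.0 / theta) for _ in range(n)) / n

def prepivot_accept(theta_hat, n, alpha=0.9, N1=200, N2=200, seed=0):
    """Double layer of sampling: keep each first-level estimate theta*_i
    whose second-level empirical cdf, evaluated at theta_hat, is < alpha."""
    rng = random.Random(seed)
    first = sorted(sample_mean(rng, theta_hat, n) for _ in range(N1))
    accepted = []
    for t_star in first:
        second = [sample_mean(rng, t_star, n) for _ in range(N2)]
        # Empirical cdf H-hat(theta_hat, t_star)
        h = sum(1 for t in second if t <= theta_hat) / N2
        if h < alpha:
            accepted.append(t_star)
    return first, accepted

first, accepted = prepivot_accept(theta_hat=1.3, n=5)
print(len(accepted), min(accepted), max(accepted))
```

The cost is N1 * N2 simulated samples, which is the quadratic growth in arithmetic noted above; even these modest sizes already require 40,000 samples.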

Example 3

Sample of n from $N(\mu, \sigma^2)$, mean m, variance $s^2$. To find an interval for $\mu$.
Suppose we don't know how to studentize and we start with the root
$n^{1/2}[m - \mu]$.

By extensive sampling from $N(m, s^2)$ we will find that the cdf of the root
is $\Phi(x/s)$, where $\Phi$ is the cdf of the standard normal. Hence Beran's
method says: use confidence set

$$\{\mu : \Phi(n^{1/2}[m - \mu]/s) < \alpha\}.$$

Thus his method has again studentized (made pivotal) the initial root. Moreover
the actual coverage of the interval is only slightly wrong - since it uses the
standard normal rather than the t-distribution.

Abstracting the ideas so far -

We have a statistic $T = \hat\theta_0$, computed from a sample of n from $f(x, \theta)$, and
want a point estimator of $\theta$ with little bias and also perhaps a confidence interval
for $\theta$. In general we won't know ET as a function of $\theta$ but we assume T is a
reasonable estimator so

$$ET = \theta + b(\theta) \text{ or } \theta(1 + B(\theta)).$$

(1) Draw N samples of n from $f(x, \hat\theta_0)$, get N estimates $\theta^*$ and their average
Ave $\theta^*$.

Do your best to find a function $g(\hat\theta, \theta)$ which is pivotal or whose distribution
changes slowly with $\theta$. Then assume

$$\mathcal{L}\,g(\hat\theta_0, \theta) = \mathcal{L}\,g(\theta^*, \hat\theta_0).$$

Solve $g(\hat\theta_0, \theta) = \mathrm{Ave}\,g(\theta^*, \hat\theta_0)$ for $\theta$ and call the solution $\hat\theta_1$. Above we used
g(x, y) = x - y and x/y.

(2) Repeat with the same plan, solving

$$g(\hat\theta_0, \theta) = \mathrm{Ave}\,g(\theta^{**}, \hat\theta_1)$$

to get $\hat\theta_2$, and so on.

This gives a sequence of point estimators whose bias should go down. It is only
worth continuing if the bias reduction is at least the size of the SD!!!!
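Steps (1) and (2) above can be written as a short loop for the two choices g(x, y) = x - y and g(x, y) = x/y. The sketch below is an illustration only: the callback `draw_estimate`, the function names and the uniform-maximum usage example are assumptions introduced here.

```python
import random

def iterate_correction(theta0, draw_estimate, kind="additive",
                       rounds=3, N=5000, seed=0):
    """Iterated bias correction. draw_estimate(rng, theta) must return
    one estimate computed from a sample simulated at parameter theta."""
    rng = random.Random(seed)
    current = theta0
    history = [theta0]
    for _ in range(rounds):
        ave = sum(draw_estimate(rng, current) for _ in range(N)) / N
        if kind == "additive":
            # solve theta0 - theta = ave - current
            current = theta0 + current - ave
        else:
            # solve theta0 / theta = ave / current
            current = theta0 * current / ave
        history.append(current)
    return history

def uniform_max(rng, theta, n=10):
    """Maximum of n draws from U[0, theta] (the estimator of Example 2)."""
    return max(rng.uniform(0.0, theta) for _ in range(n))

additive = iterate_correction(1.0, uniform_max, kind="additive")
multiplicative = iterate_correction(1.0, uniform_max, kind="multiplicative")
print(additive)
print(multiplicative)
```

For the uniform maximum with n = 10 and a starting value of 1, the multiplicative sequence should sit near (n+1)/n = 1.1 from the first round, while the additive sequence approaches the same value more gradually.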

To get confidence intervals when we are happy with the additive assertions
$\mathcal{L}[\hat\theta_0 - \theta] \approx \mathcal{L}[\theta^* - \hat\theta_0]$, or maybe
$\mathcal{L}[\hat\theta_0 - \theta] \approx \mathcal{L}[\theta^{**} - \hat\theta_1]$, we could e.g. find the 97.5% quantile of the N values
of $\theta^* - \hat\theta_0$. Call it q*(.975).

Then we have the approximate statement

$\mathrm{Prob}\{\hat\theta_0 - \theta < q^*(.975)\} \approx .975$, so a 95% confidence interval would be
$\{\hat\theta_0 - q^*(.975),\ \hat\theta_0 - q^*(.025)\}$. The naive method would instead use the .025
and .975 quantiles of the $\theta^*$ values themselves.

The multiplicative case goes similarly.
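The difference between the interval built from the additive root and the naive interval is only a matter of which end each quantile is attached to. A small sketch, with a hypothetical function name and a simple index rule for the empirical quantiles:

```python
def additive_and_naive_intervals(theta0, boot_estimates, level=0.95):
    """Intervals from N bootstrap estimates theta*: the additive-root
    interval and the naive (percentile) interval."""
    diffs = sorted(t - theta0 for t in boot_estimates)
    N = len(diffs)
    q_lo = diffs[int((1.0 - level) / 2.0 * N)]              # q*(.025)
    q_hi = diffs[min(N - 1, int((1.0 + level) / 2.0 * N))]  # q*(.975)
    additive = (theta0 - q_hi, theta0 - q_lo)
    naive = (theta0 + q_lo, theta0 + q_hi)
    return additive, naive

# Deterministic illustration: skewed "bootstrap" values around theta0 = 0
boot = [i / 100 for i in range(100)]
additive_ci, naive_ci = additive_and_naive_intervals(0.0, boot)
print(additive_ci, naive_ci)
```

When the bootstrap distribution is skewed, the two intervals are mirror images about the estimate, which is why the choice matters.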
We can use prepivoting to get a more reliable confidence interval for $\theta$: use the
step (1) results to get the cdf of $g(\theta^*, \hat\theta_0)$, $H(\cdot, \hat\theta_0)$ say. Then propose the set $\{\theta :
H[g(\hat\theta_0, \theta)] < \alpha\}$ as a confidence set for $\theta$ of size $\alpha$. Or we could use a double
bootstrap to adjust empirically the intervals to get the correct coverage.

Remark 4. There might be occasions when you'd want to estimate a known
function of $\theta$. Also when you might want to use Median $\theta^*$ instead of Ave $\theta^*$ to get
median unbiasedness. We won't pursue these directions here - they pose no new
problems.

Remark 5. Rather a lot of computation is required, especially in double
bootstrapping. There is a literature on methods for reducing it by Monte Carlo
methods, e.g. importance sampling. Redoing whole calculations many times, to
verify that a method is sound, is particularly arduous.


3  Distributions  with  several  parameters 

The number of strategies for any problem increases with the dimension of the
parameter.  We  will  here  only  show  some  experiments  on  iterative  simulation  to 
reduce  bias.  In  almost  all  high  dimensional  problems,  we  can  at  most  find 
asymptotic  pivots.  These  are  usually  a  consequence  of  calling  upon  the  central 
limit  theorem.  I  think  there  may  be  cases  where  this  strategy  leading  to 
nonparametric  results  may  be  inferior  in  small  samples  to  using  some  statistic 
related  to  a  parental  distribution,  when  that  assumption  is  not  wildly  wrong.  This  is 
a  matter  for  future  research.  This  work  was  done  with  Javier  Cabrera . 

Example 4

Efron's papers (and others) often refer to the problem of estimating the
correlation coefficient $\rho$ from a sample of n from a bivariate normal. We try to see
if iterative simulation decreases the bias.

We used 1000 samples of 5 from a bivariate normal, both means zero, unit
variances and $\rho$ = 0.7. Each pair of lines in Table 1 refers to a corrected set of
estimates so we have the zeroth to fifth correction. The first line each time is the
mean of all 1000 estimates. The second line is the standard deviation of these 1000
estimates. Thus e.g. for the means, the SD should be $1/5^{1/2}$ = .447, about the
value in Table 1. Evaluating the square root of the large sample variance var r =
$(1-\rho^2)^2/n$, we find 0.228, a little less than the 0.33 in our table. The divisor n-3
here gives .361.
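The three figures quoted for n = 5 and a correlation of 0.7 can be reproduced in a few lines; the function name and the `offset` argument are illustrative.

```python
import math

def large_sample_sd_r(rho, n, offset=0):
    """Square root of the large-sample variance (1 - rho^2)^2 / (n - offset)
    of the sample correlation coefficient."""
    return (1.0 - rho ** 2) / math.sqrt(n - offset)

sd_mean = 1.0 / math.sqrt(5)                  # SD of a mean of 5 unit-variance values
sd_r_n = large_sample_sd_r(0.7, 5)            # divisor n
sd_r_n3 = large_sample_sd_r(0.7, 5, offset=3) # divisor n - 3
print(round(sd_mean, 3), round(sd_r_n, 3), round(sd_r_n3, 3))
```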

Following down the columns we see that the mean estimates don't change, as
we would expect, that the standard deviations get a little closer to unity and that
the average r's approach .7.

But motion stops after the first iteration.

Remarks. Better results would come if we used the Fisher transformation of r,

$$z = (1/2)\log[(1+r)/(1-r)], \qquad \zeta = (1/2)\log[(1+\rho)/(1-\rho)],$$

since as $n \to \infty$, $z - \zeta$, scaled by $(n-3)^{1/2}$, becomes standard normal and so is then pivotal.
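As an illustration of how the transformed quantity would be used, the sketch below builds the usual textbook interval for the correlation on the z scale and maps it back. The normal critical value 1.96 and the function name are assumptions added here, and the normal approximation is rough for samples as small as 5.

```python
import math

def fisher_z_interval(r, n, z_crit=1.96):
    """Approximate interval for rho from the Fisher transformation,
    treating (n-3)^{1/2} (z - zeta) as standard normal. Requires n > 3."""
    z = math.atanh(r)                    # (1/2) log((1+r)/(1-r))
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

lo, hi = fisher_z_interval(0.7, 5)
print(lo, hi)
```

Because the interval is formed on the z scale and transformed back, it always stays inside (-1, 1) and is asymmetric about r.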

Efron  and  others  have  shown  that  a  double  bootstrap  gives  a  confidence 
interval  with  good  coverage. 

Here we should probably not have bothered to iterate for the means and
variances. Notice that there are here 5 parameters, only one of which, $\rho$, was of
interest.


Example  5 

We now turn to one of the two problems that led us to this way of thinking.


Suppose we have samples of $n_j$ from Fisher distributions $F(\mu_j, \kappa_j)$, j = 1, 2, on
the unit sphere ||x|| = 1 in three dimensions. So $\|\mu\| = 1$ where $\mu$ is the mean
direction and axis of rotational symmetry of the distribution and $\kappa$ controls its
concentration. With $x \cdot \mu = \cos\theta$,

$$\text{Fisher prob density} = \{\kappa/4\pi\sinh\kappa\}\exp\{\kappa\, x \cdot \mu\}.$$

The estimates are $m_j$, the directions of the sum $R_j$ of the observed unit vectors, and

$$k_j = (n_j - 1)/(n_j - \|R_j\|).$$
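The two estimates are simple functions of the vector sum of the observations. A minimal sketch, with a hypothetical function name and made-up unit vectors clustered about the z-axis:

```python
import math

def fisher_estimates(vectors):
    """Mean direction m and concentration estimate k = (n-1)/(n-||R||)
    from a list of unit vectors in three dimensions."""
    n = len(vectors)
    resultant = [sum(v[i] for v in vectors) for i in range(3)]
    length = math.sqrt(sum(c * c for c in resultant))  # ||R||
    m = tuple(c / length for c in resultant)
    k = (n - 1) / (n - length)
    return m, length, k

# Four unit vectors at colatitude 0.2 rad, evenly spaced in longitude
a = 0.2
vectors = [
    (math.sin(a) * math.cos(phi), math.sin(a) * math.sin(phi), math.cos(a))
    for phi in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)
]
m, length, k = fisher_estimates(vectors)
print(m, length, k)
```

Tightly clustered vectors give a resultant length close to n and therefore a large k; widely scattered vectors give a short resultant and a small k.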

These $\kappa$-estimators have distributions skewed to the right and are not unbiased.
We really want to estimate the angle $\theta$ between the means
$\mu_1$ and $\mu_2$.

This is naturally estimated from

$$m_1 \cdot m_2 = \cos\hat\theta_0$$

but it is clear that this will lead to an overestimate which can be very severely
biased. So we want to take the bias out of the estimates of $\theta$ and $\kappa_1$ and $\kappa_2$ to the
extent that this is possible. To use our method we must draw N samples of $n_1$ and
$n_2$ from Fishers with concentrations $k_1$ and $k_2$ and whose mean vectors are
separated by angle $\hat\theta_0$. We did separate runs using all additive and all
multiplicative methods but saw no obvious differences. We don't know any simple
but more pivotal quantities that we could have used. We used many
combinations of $\kappa = \kappa_1 = \kappa_2$ and $n = n_1 = n_2$ and $\theta$ to study point estimation. But only
in a few did we calculate the actual coverage of our 95% nominal intervals. See
Table 2 for some results.

Morals

Over the range of sample sizes and $\kappa$'s tried so far, the methods are not good:

* correction dwarfed by sd

* poor coverages from percentile intervals so need to use better
method (e.g. double sampling) to get CI's.

Competitive methods - see e.g. Debiche & Watson (1991) - also have trouble.

The problem is inherently hard.

Example 6

Our  other  motivation  was  a  set  of  functional  relationship  problems.  Three 
problems  have  come  up  from  my  geophysical  contacts. 

(1) The estimation of a linear transformation with positive determinant (Gleser and
Watson, 1973) from the initial and final positions of n points, measured with errors
with known covariance matrices.

(2) Radioactive dating methods lead to fitting a linear relation between three
variables subject to errors with known, possibly unequal, covariance matrices
(Kent, Watson, Onstott, 1990). The method given there works in any number of
dimensions.

(3) The motion of sea ice deduced from the initial and final positions of n radio
beacons on the ice whose positions are measured with planar errors by satellite.
One could possibly assume one knew the covariance matrix of the errors. Here
then we are estimating a rigid motion - displacement and rotation.

In Problem 2 above, the actual errors were small and our mathematical method
led to standard errors that agreed very well with those obtained by John Kent,
who kindly used our current method at our request on these data sets. Problems 1
& 3 are yet to be attacked.

Instead  we  tried  the  simple  but  classical  problem  of  fitting  a  straight  line  with 
both  variables  subject  to  normal  errors.  Here  if  there  are  n  points  with  the  same  B's 
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in  each  coortJiH^e  and  the  true  points  fall  on  n  =  a  +  p  C.  we  have  n  ( the  C’s)  plus 
(a  &  p)  plus  1  (B).  i.e.  n+3  parameters  from  2n  observations. 

. ,n  0ur  exPeriments  we  also  included  the  method  (denoted  in  our  table  by  - 
mle")  recommended  e.g.  in  Fuller  (1987). 

In  problems  1&3  the  mle  type  methods  are  complicated  so  one  might  want  to  use 
simpler  least  squares  type  methods.  So  we  tried  here  to  start  from  a  least  squares 
type  method,  re-estimating  positions  of  the  points  by  their  orthogonal  projections 
onto  the  currently  fitted  line,  refitting  the  line,  getting  a  new  ‘8  and  points,  &  so  on. 

Our results (Table 3) are confusing, probably because we have so many
parameters to estimate in each round and again we have no pivots.

Doubt now whether one could EXTRAPOLATE from what was learnt here!

The one clear result was that in many practical situations the mle's will not
behave as advertised. That this is so is seen from Figures 1 and 2. We are now
trying to see if we can "fix up" the mle's!!!

This,  I  suspect,  will  in  fact  be  the  major  practical  use  of  parametric 
bootstrapping. 


4  A  simulation  solution  to  the  “Fold  Test  Problem”. 

A simplified version of a classical problem in paleomagnetism is the
following. Imagine that an eruption spreads hot lava across a plane. When the
iron-rich lava cools below its Curie point, it will acquire magnetization parallel to
the local earth's field at that time. If the slab is not too large, all parts of the slab are
magnetized in parallel. We imagine that no subsequent events alter this
magnetization. Suppose that it is subsequently folded. Then if we go to different
sites on this folded slab and measure this frozen or remanent magnetization, we will
not get parallel vectors. However, the angle between the normal to the bedding
(old horizontal) plane and the magnetization should be preserved. If, however, the
magnetization of a folded formation was acquired somehow after folding, the
magnetization in the folded formation would be parallel at all sites. The other
alternative is that the magnetization was acquired at some time during the folding.

For some 40 years people have tried to sort this out with significance tests. It
has, since 1953, been standard to describe the scatter of paleomagnetic direction
measurements by the Fisher distribution mentioned in the last section. Nowadays
bootstrap methods are being applied to this problem - see e.g. Fisher & Hall
(1991). It is possible to produce tests that a set of mean vectors come from
distributions with parallel mean directions (null hypothesis: Post). And it is
possible to test that a set of angles is constant (null hypothesis: Pre). But one can
get into a logical muddle. Many references could be given - but see e.g. Tauxe,
Kylstra, and Constable (1991).

We prefer to think about this differently - as an attempt to estimate the relative
timing of magnetization and folding. Let us assume - as is very common here -
that the axis of folding is known. If it is known only with error this may be coped
with by a further layer of simulation. Then the current folding can be unfolded by any
amount until we get back to the original, which is 100% unfolding. The present is
0% unfolding. Thus we seek the % unfolding when the rock was magnetized.
At that % the measured unit vectors will be more parallel than at other %'s.

A more complicated approach should be used in practice but the following
will give our idea. The basic data will be one sample of n_i unit vectors (the
directions of magnetization of specimens) at site i with mean direction m_i and
concentration κ_i, with i = 1, ..., s. Now the mean directions could be unfolded (i.e.
rotated about the fold axis), and at each percent a "kappa" estimate based on these
rotated means could be found and plotted against the %. If the curve had a
maximum at, say, 75%, we would want to conclude that the magnetization occurred
at that %. But what is the variability of that % estimate? One suggestion, which
we will illustrate below, is to assume that the data at the i-th site are distributed as
F(m_i, κ_i), draw a sample of n_i from this and compute a new mean direction m*_i,
say, for i = 1, ..., s and so a new curve. Then go on till one has N curves. Then
one can get the statistics of the maxima - position, height, etc. The percentiles of
the positions give us a confidence interval for the position of the maximum. If the
interval overlaps 100% (0%) one cannot rule out the "pre" ("post") hypothesis. In
our worked example, the data used were supplied by K. Kodama. We show the first
20 curves in Fig. 2. From the statistics of the results (see Figures 3, 4, 5), we found
that 95% of all positions of the maximum lay in the interval (65%, 80%). The
practical inference in this case is clear. No doubt in other cases there might be so
much noise that no clear-cut assertion could be made. These computations were
done for me by Michel Debiche.
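
The simulation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the worked example: the function names, the synthetic tilts and directions, the 1% grid, and the choice of (s-1)/(s-R) as the "kappa" estimate are all ours. Each site is assumed to have a known present tilt about the fold axis, and unfolding by a fraction p rotates that site's mean direction back through p times its tilt.

```python
import numpy as np

def rotate(v, axis, angle):
    """Rotate row vectors v about `axis` by `angle` radians (Rodrigues formula).
    `angle` may be a scalar or hold one angle per row of v."""
    k = np.asarray(axis, float)
    k = k / np.linalg.norm(k)
    a = np.asarray(angle, float)[..., None]
    return (v * np.cos(a) + np.cross(k, v) * np.sin(a)
            + k * (v @ k)[..., None] * (1 - np.cos(a)))

def sample_fisher(mean, kappa, n, rng):
    """Draw n unit vectors from the Fisher distribution F(mean, kappa)."""
    u = rng.random(n)
    w = 1 + np.log(u + (1 - u) * np.exp(-2 * kappa)) / kappa  # cos(colatitude)
    phi = 2 * np.pi * rng.random(n)
    r = np.sqrt(np.clip(1 - w**2, 0, None))
    z = np.column_stack([r * np.cos(phi), r * np.sin(phi), w])  # about the pole
    mean = mean / np.linalg.norm(mean)
    ax = np.cross([0.0, 0.0, 1.0], mean)
    if np.linalg.norm(ax) < 1e-12:            # mean is (anti)parallel to the pole
        return z if mean[2] > 0 else -z
    return rotate(z, ax, np.arccos(np.clip(mean[2], -1, 1)))

def mean_dir(x):
    r = x.sum(axis=0)
    return r / np.linalg.norm(r)

def kappa_hat(units):
    """Concentration estimate (s - 1) / (s - R) from s unit vectors."""
    s = len(units)
    R = np.linalg.norm(units.sum(axis=0))
    return (s - 1) / max(s - R, 1e-12)

def kappa_curve(means, tilts, axis, fracs):
    """Kappa of the site means after unfolding by each fraction in fracs."""
    return np.array([kappa_hat(rotate(means, axis, -p * tilts)) for p in fracs])

def fold_test(samples, tilts, axis, n_boot=200, seed=0):
    """Parametric bootstrap 95% interval (in %) for the unfolding at peak kappa."""
    rng = np.random.default_rng(seed)
    fracs = np.linspace(0, 1, 101)
    means = [mean_dir(x) for x in samples]
    kappas = [kappa_hat(x) for x in samples]
    peaks = []
    for _ in range(n_boot):
        m_star = np.array([mean_dir(sample_fisher(m, k, len(x), rng))
                           for m, k, x in zip(means, kappas, samples)])
        peaks.append(fracs[np.argmax(kappa_curve(m_star, tilts, axis, fracs))])
    return 100 * np.percentile(peaks, [2.5, 97.5])

# Synthetic illustration only: 5 sites, magnetized at 75% unfolding.
rng = np.random.default_rng(1)
axis = np.array([1.0, 0.0, 0.0])
tilts = np.radians([-60.0, -30.0, 20.0, 50.0, 70.0])
d = np.array([0.0, 0.5, np.sqrt(0.75)])
present = rotate(np.tile(d, (5, 1)), axis, 0.75 * tilts)
samples = [sample_fisher(m, 100.0, 10, rng) for m in present]
lo, hi = fold_test(samples, tilts, axis)
print(lo, hi)
```

Replacing `sample_fisher` by resampling the rows of each site's original sample would give the nonparametric variant of the same procedure.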

If the fold axis is measured with some known error it would be easy to add
another layer of simulation to reflect it - because one will get different m_i at the
same % unfolding. This will broaden a peak such as we gave in Fig 2 - and so
broaden the confidence intervals.

Here we got our resamples from Fisher distributions. Instead we could draw
bootstrap samples from the points in each original sample, find their mean
directions, rotate each by each % unfolding, calculating kappa each time. This too
gives a family of graphs which would be treated as before.

Finally I would prefer to use not kappa but a statistic I have devised (Watson,
1983) to check the null hypothesis that a number of populations have the same
mean direction. The use of kappa here goes back to early but incorrect
paleomagnetic practice - McElhinny (1964).

These two methods are, I suppose, the applications of the parametric and
nonparametric bootstrap to this problem! I'm sure that they will get refined and
extended when they are used in practice.

TABLE 1

BIVARIATE NORMAL

n = 5
10000 simulations   1000 replications   5 iterations
mu1 = 0   mu2 = 0   sigma1 = 1   sigma2 = 1   ro = 0.7

            mu1         mu2      sigma1     sigma2         ro
Mean   -0.021000   -0.019741   0.929537   0.925653   0.655325
SD      0.448826    0.438744   0.352741   0.345771   0.330464

Mean   -0.020863   -0.019644   0.988923   0.984850   0.689007
SD      0.448893    0.438763   0.375278   0.367848   0.340013

Mean   -0.021003   -0.019646   0.989047   0.984776   0.686186
SD      0.449051    0.438793   0.375146   0.367873   0.337448

Mean    0.020925   -0.019757   0.988979   0.984864   0.687068
SD      0.448873    0.438851   0.375570   0.368098   0.337839

Mean    0.020738   -0.019612   0.988741   0.984708   0.686768
SD      0.448984    0.439047   0.375188   0.367905   0.337750

Mean    0.021048   -0.019722   0.988989   0.984861   0.686793
SD      0.449041    0.438764   0.375357   0.368018   0.337718

TABLE 2

USING MULTIPLICATIVE CORRECTION FOR THE ANGLE
(? marks characters illegible in the source)

Sample size = 5 (rest of heading illegible)
                         Angle     Std.Dev.   Kappa1    Std.Dev.   Kappa2    Std.Dev.
Uncorrected Estimates    21.92     ?          ?         ?          26.94     18.29
Corrected Estimates      20.32     8.664      17.41     ?          17.94     12.28
Corrected Twice          19.11     9.116      20.28     13.??      20.99     14.37
Linear Correction        16.36     10.654     19.??     ?          20.17     13.79
Confidence Intervals     2.5%      5%         50%       95%        97.5%
Corrected Twice          1.688     3.025      19.34     33.86      36.6
Linear Correction        0.000     0.000      16.77     33.43      36.3

Sample size = 20, True Parameter = 5, Kappa = 5
                         Angle     Std.Dev.   Kappa1    Std.Dev.   Kappa2    Std.Dev.
Uncorrected Estimates    11.770    6.017      5.310     1.279      5.291     1.226
Corrected Estimates      9.139     6.348      5.012     1.214      4.993     1.163
Corrected Twice          7.947     6.597      5.024     1.218      5.006     1.169
Linear Correction        5.784     7.256      4.961     1.868      5.004     1.265
Confidence Intervals     2.5%      5%         50%       95%        97.5%
Corrected Twice          0.08285   0.2718     6.376     20.50      22.92
Linear Correction        0.00000   0.0000     1.773     19.74      22.84

Sample size = 20, True Parameter = 20, Kappa = 20
                         Angle     Std.Dev.   Kappa1    Std.Dev.   Kappa2    Std.Dev.
Uncorrected Estimates    20.19     4.097      21.23     5.119      21.15     4.904
Corrected Estimates      19.77     4.195      20.06     4.837      19.98     4.635
Corrected Twice          19.71     4.223      20.11     4.851      20.04     4.654
Linear Correction        19.52     4.338      20.16     6.314      20.00     4.821
Confidence Intervals     2.5%      5%         50%       95%        97.5%
Corrected Twice          10.96     12.64      19.80     ?          27.87
Linear Correction        10.80     12.32      19.67     26.33      27.68

Sample size = 5, True Parameter = 20, Kappa = 5
                         Angle     Std.Dev.   Kappa1    Std.Dev.   Kappa2    Std.Dev.
Uncorrected Estimates    29.22     14.82      6.523     4.381      6.740     1.581
Corrected Estimates      24.62     15.95      4.342     2.902      4.476     3.083
Corrected Twice          20.71     16.32      5.018     3.464      5.202     3.639
Linear Correction        18.49     16.16      4.842     3.302      5.011     3.482
Confidence Intervals     2.5%      5%         50%       95%        97.5%
Corrected Twice          0.1962    0.6265866  17.90     50.36      57.43
Linear Correction        0.0000    0.0004011  14.66     48.61      56.17

Sample size = 20, True Parameter = 20, Kappa = 5
                         Angle     Std.Dev.   Kappa1    Std.Dev.   Kappa2    Std.Dev.
Uncorrected Estimates    21.54     8.373      5.310     1.279      5.291     1.226
Corrected Estimates      19.65     8.969      5.012     1.214      4.993     1.163
Corrected Twice          18.97     9.389      5.024     1.218      5.006     1.169
Linear Correction        17.81     10.299     4.961     1.868      5.004     1.265
Confidence Intervals     2.5%      5%         50%       95%        97.5%
Corrected Twice          1.112     2.876      ?         34.45      37.12
Linear Correction        0.000     0.000      18.52     34.25      37.12

TABLE 3: α = 2, β = 1

FUNCTIONAL RELATIONSHIP

Sample size = 20
(? marks characters illegible in the source; the heading of the second group of three columns is not recoverable)

σ = 0.1                α         β         σ          α          β         σ
MLE                 1.98?9?   1.02032   0.09189    1.99547    1.00855   0.09?75
  Std.Dev.          0.11450   0.16878   0.03863    0.062??    0.10?0?   0.0????
Uncorrected         2.01377   0.97090   0.09334    2.04155    0.91647   0.10109
  Std.Dev.          0.10783   0.17564   0.04019    0.05566    0.09510   0.01799
Corrected Once      1.99854   1.00120   0.10001    2.00642    0.98865   0.10061
  Std.Dev.          0.11137   0.18221   0.04242    0.06027    0.10419   0.01748
Corrected Twice     1.99426   1.00979   0.09958    1.99960    1.00029   0.10025
  Std.Dev.          0.11281   0.18521   0.04186    0.06170    0.10716   0.01724
Linear Correction   1.99063   1.01455   0.09973    1.99831    1.00282   0.10044
  Std.Dev.          0.11547   0.19639   0.04202    0.06217    0.10815   0.01732

σ = 0.3                α         β         σ          α          β         σ
MLE                 1.80990   1.32309   0.26485    1.93024    1.13252   0.29255
  Std.Dev.          3.18758   5.11863   0.10950    0.63185    1.06141   0.04966
Uncorrected         2.13501   0.72845   0.29520    2.23287    0.53413   0.32352
  Std.Dev.          0.28726   0.45909   0.13687    0.12044    0.18987   0.06142
Corrected Once      2.06939   0.85938   0.32011    2.14881    0.70196   0.32183
  Std.Dev.          0.33261   0.54902   0.15849    0.14471    0.24019   0.06334
Corrected Twice     2.02968   0.93864   0.31434    2.11198    0.77549   0.31594
  Std.Dev.          0.37227   0.62756   0.15627    0.15884    0.26922   0.06196
Linear Correction   1.94891   1.06672   0.31732    2.07814    0.84229   0.32037
  Std.Dev.          1.61161   5.84560   0.15687    0.17893    0.31108   0.06361

σ = 0.5                α         β         σ          α          β         σ
MLE                 1.4838    1.9459    0.40760    2.03963    0.8823    0.47362
  Std.Dev.          18.4917   32.5957   0.16660    6.57566    14.5629   0.07734
Uncorrected         2.2811    0.4378    0.47946    2.35761    0.2654    0.53149
  Std.Dev.          0.4025    0.6011    0.21989    0.16119    0.2263    0.09648
Corrected Once      2.2299    0.5417    0.53285    2.30134    0.3978    0.53822
  Std.Dev.          0.4800    0.7577    0.26366    0.19732    0.3121    0.10159
Corrected Twice     2.1963    0.6102    0.52486    2.27339    0.4535    0.53221
  Std.Dev.          0.5440    0.8838    0.26271    0.21830    0.3582    0.10164
Linear Correction   1.9099    0.8702    0.52921    2.23999    0.5182    0.53648
  Std.Dev.          5.7236    2.7234    0.26260    0.30512    0.4264    0.10122

σ = 1.0                α         β         σ          α          β         σ
MLE                 2.3755    1.4762    0.73618    2.23275    -0.0011   0.89460
  Std.Dev.          19.9005   55.9057   0.30012    11.81564   36.7087   0.14062
Uncorrected         2.4190    0.1374    0.67690    2.460?2    0.0823    1.00262
  Std.Dev.          0.6312    0.6792    0.39050    0.25924    0.2448    0.17349
Corrected Once      2.4017    0.1676    0.98398    2.44374    0.1166    1.02264
  Std.Dev.          0.7367    0.8874    0.47236    0.29379    0.3529    0.18170
Corrected Twice     2.3911    0.1867    0.97236    2.43466    0.1341    1.01557
  Std.Dev.          0.6322    1.0663    0.47299    0.31635    0.4126    0.18200
Linear Correction   2.3126    0.0173    0.97908    2.42273    0.1597    1.02004
  Std.Dev.          6.3744    11.6193   0.47204    0.47741    0.5131    0.18116

σ = 2.0                α         β         σ          α          β         σ
MLE                 5.500     ?1.566    1.39434    3.12274    -0.9583   1.73817
  Std.Dev.          181.505   96.599    0.57535    27.06611   73.2686   0.27413
Uncorrected         2.417     0.042     1.66592    2.49416    0.0168    1.94763
  Std.Dev.          1.135     0.690     0.74930    0.47249    0.2464    0.33696
Corrected Once      2.458     0.051     1.87017    2.48995    0.0236    1.98775
  Std.Dev.          1.280     0.696     0.90436    0.50090    0.3565    0.35256
Corrected Twice     2.457     0.057     1.84784    2.48762    0.0270    1.97479
  Std.Dev.          1.413     1.067     0.90460    0.52055    0.4173    0.35310
Linear Correction   1.959     0.633     1.86108    2.48089    0.0344    1.98286
  Std.Dev.          48.055    33.013    0.90312    0.76009    0.5600    0.35160

[Figure residue: abscissa "Per cent rotation"; annotation "sites = 8; no. limbs" (value illegible)]

FIGURE 4

Histogram of maximum k values

FIGURE 5

Histogram of % rotation for maximum k

Abstract

Structures  which  fail  due  to  cyclic  loading  are  said  to  fail  in  fatigue. 
Damage  accumulation  due  to  fatigue  is  the  primary  factor  which  limits 
the  useful  life  of  aircraft. 

Designers  of  new  aircraft  obtain  information  with  which  to  estimate 
fatigue  life  by  extensive  testing  of  small  specimens,  in  addition  to  very 
limited  testing  of  actual  structural  components.  Based  on  this  data,  an 
estimate  is  made  of  the  lifetime  for  which  very  high  (.999999)  reliability  is 
assured.  This  high  reliability  is  currently  a  requirement  in  the  construction 
of  army  helicopters  and  fixed  wing  aircraft. 

There  is  little  agreement  among  designers  on  how  fatigue  life  should 
be  determined,  as  well  as  insufficient  understanding  of  the  uncertainties 
involved  in  high  reliability  computations. 

This  presentation  reviews  the  fatigue  life  determination  procedures  for 
several  manufacturers  and  points  out  some  ways  in  which  these  methods 
are  deficient  in  obtaining  high  reliability. 

The  purpose  in  introducing  this  clinical  paper  is  to  obtain  statistical 
procedures  that  will  provide  highly  reliable  fatigue  loaded  structures  such 
as  the  Army  helicopter. 

INTRODUCTION


Methodology to substantiate helicopter fatigue life has received considerable attention
during the last decade. This interest was stimulated by the substantial variability in the results
from the study on the American Helicopter Society pitch link problem.¹ Recently, further
interest has resulted from the U.S. Army's introduction of a structural fatigue reliability
criterion for rotorcraft. This criterion has been interpreted² as a requirement for a component
lifetime estimate to have a reliability of 0.999999.

Helicopter safe life reliability methodology has recently been the subject of several papers³⁻⁶
and an American Helicopter Society subcommittee round robin.⁷

The authors⁸ have investigated the sensitivity of high reliability estimates from simple
stress-strength statistical model computations. Results showed substantial variability in
reliability estimates even for almost undetectable differences in the assumed probability density
functions (PDFs) representing the stress and strength data.


In this report, the uncertainties in determining high reliability for helicopter
component safe life design are studied by introducing a simulation process to identify the effects
of a small amount of variability in the design variables for determining the lifetime
estimate. The reliability values are determined for a generic uniaxial steel structure loaded
in tension, similar to a helicopter pitch link component, by applying Miner's Linear
Damage Rule. The six component fatigue test values were obtained from Arden,¹ where
the maximum applied stress (S) on the component is tabulated with respect to cycles to
failure (N). In order to obtain an SN curve to represent the component fatigue test
results, a separate regression analysis was applied to a larger set of coupon tests of a
steel for which the results are tabulated in Bury.¹⁰ The assumed spectrum load used in
determining the lifetime estimate was obtained from Berens.¹¹ Note that only the six
component fatigue test values are from Reference 1 and the remaining test values are from
References 10 and 11.


THE  COUPON  TEST  SN  CURVE 

This section describes the procedure for determining an SN regression curve to represent
coupon fatigue test data,¹⁰ as shown in Figure 1. The assumed functional representation¹²
of the data is

$$ S = S_\infty + (S_u - S_\infty)\, e^{-\beta (\log_{10} N)^{\gamma}} \qquad (1) $$

where S is the maximum applied stress and N is the number of cycles required for the
coupon to fail. S∞ is the coupon endurance limit, representing the case N = ∞,
and Su represents the static strength of the coupon; i.e., the strength for N = 1. The
shape of the SN curve is determined by β and γ. S∞, Su, β, and γ were determined from
an IMSL computer code for solving nonlinear regression problems. The
resulting curve is shown in Figure 1 (solid line) with the individual coupon fatigue
test values.

Figure 1. Mean coupon SN curve/fatigue data (abscissa: log10 N cycles)

A review of the literature on the determination of component fatigue life showed that
various functional representations similar to Equation 1 have been applied where N is the
independent variable and S is the dependent (response) variable. This is counter to the
conventional functional representation of test data where S would be the independent variable
in the analysis since a fixed cyclic load (stress) value is applied and a resultant (dependent)
number of cycles to failure is recorded. In order to obtain N as the dependent variable,
Equation 1 can be inverted resulting in the following:

$$ \log_{10} N = \exp\left\{ \left[ \log\left( -\log\frac{S - S_\infty}{S_u - S_\infty} \right) - \log \beta \right] / \gamma \right\} \qquad (2) $$

Although Equation 2 is recommended in determining the functional representation of the
data, Equation 1 was applied in this study since it is commonly used in engineering fatigue
analysis, and the qualitative measure of the relative uncertainties in determining the
reliability at a specified lifetime is not affected by the SN curve assumption.
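
As a check on the inversion, the two forms can be coded and round-tripped. This is a sketch only: it assumes Equation 1 has the form S = S∞ + (Su − S∞) exp(−β (log10 N)^γ), and the parameter values below are hypothetical, not the fitted values from the coupon data.

```python
import math

def stress_from_cycles(N, S_inf, S_u, beta, gamma):
    """Equation 1 (assumed form): stress S as a function of cycles N >= 1."""
    return S_inf + (S_u - S_inf) * math.exp(-beta * math.log10(N) ** gamma)

def log10_cycles_from_stress(S, S_inf, S_u, beta, gamma):
    """Equation 2: log10 N as a function of S, valid for S_inf < S < S_u."""
    ratio = (S - S_inf) / (S_u - S_inf)
    return math.exp((math.log(-math.log(ratio)) - math.log(beta)) / gamma)

# Hypothetical parameters for illustration only (normalized so S_inf = 1).
params = dict(S_inf=1.0, S_u=3.0, beta=0.05, gamma=2.0)

S = stress_from_cycles(1e5, **params)
print(S, log10_cycles_from_stress(S, **params))
```

At N = 1 the exponent vanishes and the curve returns Su, which matches the interpretation of Su as the static strength.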

In order to simplify the analysis, the fatigue data from Reference 10 were normalized
with respect to the estimated S∞ value determined from the initial application of
regression analysis. Another SN curve was then obtained from the normalized data, where β,
γ, and Su were obtained for a known S∞ of 1. The resultant SN(N) curve is shown in
Figure 2. The figure also shows the regression results SN(S) from the application of
Equation 2.

THE COMPONENT SN CURVE

Usually the shape of the component SN curve is obtained from a prior coupon SN(N)
curve, as shown in Figure 2. The location (ordinate position) of the curve is determined
from extrapolating the individual component values, as shown in Figure 3, to N = 10^8 cycles.
The original component values in Reference 1 have been rescaled so that they have scales
similar to the S values in Figure 2. The extrapolation process involves vertically positioning
the coupon SN curve (see Figure 3) to agree with the individual component values and then
extending the curves to N = 10^8 cycles. Si values are obtained for N = 10^8 and the
component curve's mean stress position at this N is

$$ S_m = \sum_{i=1}^{n} S_i / n, \qquad (3) $$

where n is the number of component test results. The solid line in Figure 4 shows the
representative component SNC curve and component test data. Since there are usually only six
component test results available, because of the costs in component testing, the above
procedure is often applied. Using the more extensive, less expensive coupon test results to
determine the shape of the SN curve assumes similar material, test, and environment for both
coupon and component.

Figure 3. Extrapolation of component data (abscissa: log10 N cycles)


Figure  4.  Mean  SN  curve  for  component  test  values. 

SPECTRUM LOAD


The normalized spectrum loading used in the fatigue life analysis is shown in Figure 5a. The
loading was obtained from a rainflow count of a modified combat history described in
Reference 11. The spectrum was determined by the number of loads within discrete range
increments. The spectrum is simplified to five loads {L(i)}, i = 1, ..., 5, by expanding the size
of the range increments and including the appropriate cycle count {n(i)}, i = 1, ..., 5, within
each expanded range.

The normalization procedure involved dividing each L(i) by the smallest damaging load S∞
(endurance limit). This simplification was adequate for identifying the spectrum effects in this study.


MINER’S  RULE 

In order to obtain the lifetime estimate from the simplified fatigue load (L) and the
normalized material strength (S) data shown in Figures 5a and 5b, the following linear damage rule⁹ is
applied where

$$ DF = \sum_{i=1}^{5} n(i) / N(i) \qquad (4) $$

is the damage fraction for each pass or repetition of the spectrum. This representation of
operation hours is described in Reference 10. The n(i)s are the number of cycles corresponding to
the applied load L(i), as shown in Figure 5a. The N(i) values are obtained from the SN curve,
as shown in Figure 5b, where the corresponding S(i) values are identified in the figure by the L(i)
values obtained from the spectrum loads in Figure 5a. In addition, the rule requires that

$$ N_p \cdot DF = 1 \qquad (5) $$

in order to determine the maximum number of passes (Np) that can occur prior to the
component failure.
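
A small numerical sketch of the linear damage rule may help. The cycle counts and cycles-to-failure values below are hypothetical round numbers chosen for illustration, not the spectrum of Figure 5a; a load at or below the endurance limit is represented by an infinite N(i) and so contributes no damage.

```python
import math

def damage_fraction(cycles_per_pass, cycles_to_failure):
    """Damage per pass of the spectrum: DF = sum of n(i) / N(i)."""
    return sum(n / N for n, N in zip(cycles_per_pass, cycles_to_failure))

def passes_to_failure(df):
    """Number of passes Np at which Np * DF = 1."""
    return 1.0 / df

# Hypothetical five-load spectrum (illustration only).
n = [1000, 500, 100, 20, 5]          # cycles applied per pass at each load
N = [1e7, 1e6, 1e5, 1e4, 1e3]        # cycles to failure at each load

df = damage_fraction(n, N)
print(df, passes_to_failure(df))

# A non-damaging load (N = infinity) adds nothing to DF.
print(damage_fraction([1000, 5], [math.inf, 1e3]))
```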


Figure  5b.  Component  mean  SN  curve. 

SIMULATION PROCEDURES IN DETERMINING COMPONENT RELIABILITY


Bootstrap Method Applied to Coupon SN Curve Computation

The Bootstrap Method,¹⁴ a simulation process, was introduced in the fatigue life
reliability analysis in order to examine the effects of uncertainties used in determining the
coupon SN curve and the resultant component reliability. Only one reliability estimate can be
obtained from a single set of data; however, even with all conditions the same, one would
expect to determine a different reliability estimate from another set of data. The
Bootstrap Method provides a technique for estimating the variability among random sets of
data generated under equivalent conditions using data from only a single random sample.
The idea is to create arbitrarily many new datasets by sampling with replacement from the
original data. If there are n values in the original data, then a new dataset is created by
selecting n values from among these observed data, allowing data values to be selected
more than once. The probability distribution of the reliability calculated from these
datasets, which are created by taking random samples from the single observed dataset,
provides an estimate of the actual probability distribution of reliability which could, in
principle, be determined from future datasets.

The material fatigue testing involves obtaining the number of cycles to failure for a
specified applied load (S), shown as the individual data points in Figure 1. The Bootstrap
Method involves selecting a random set of 9 values independently with replacement from the
set of cycles to failure values {Nj(i)}, i = 1, ..., 9, for each jth applied stress from {Sj},
j = 1, ..., 12, as shown in Figure 1 and obtained from Reference 12. The result is a new set
{Nj*(i)}, i = 1, ..., 9, for each of the Sj values. The new set is called the Bootstrap sample,
where some values* can be repeated once, twice, or more times. The new set is then used in the
regression procedures described in the Coupon Test SN Curve Section in order to obtain a new
SN curve (S in Equation 1).

In Figure A1 (see Appendix), the results of applying the Bootstrap show a 90%
confidence band on the original SN(N) curve. Results in Figure A2 show the individual SN(N)
curves obtained for the Bootstrap samples. The results from Figures A1 and A2 indicate that
there is more variability for large or small N values than for the central region of the curve,
which is consistent with determining confidence bands on regression curves.

For calculating the effects of coupon SN curve uncertainties, a damage fraction (DF*)
value is computed by applying the linear damage rule. The above procedure is repeated Mb
times, so that a set {DFk*}, k = 1, 2, ..., Mb, is obtained. The component reliability R can then be
obtained by counting the number (Nb) of times Np · DFk* < 1, k = 1, 2, ..., Mb, where Np, the
number of passes, is specified. The computed component reliability R including uncertainties
in the coupon testing procedure is written as

R = Nb/Mb, (6)

where Mb is the number of repeated applications of the Bootstrap procedure.
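
The resampling and counting steps can be sketched as follows. This is an illustration under assumptions, not the authors' code: the regression refit is left as a user-supplied function, and the stand-in `toy_damage_fraction` and the small dataset are hypothetical placeholders used only to make the sketch runnable.

```python
import numpy as np

def bootstrap_sample(cycles_by_stress, rng):
    """Resample, with replacement, the cycles-to-failure values at each stress."""
    return {S: rng.choice(N, size=len(N), replace=True)
            for S, N in cycles_by_stress.items()}

def bootstrap_reliability(cycles_by_stress, damage_fraction, n_passes, m_b, seed=0):
    """R = Nb / Mb: share of bootstrap datasets for which Np * DF* < 1."""
    rng = np.random.default_rng(seed)
    df_star = np.array([damage_fraction(bootstrap_sample(cycles_by_stress, rng))
                        for _ in range(m_b)])
    return np.mean(n_passes * df_star < 1.0)

def toy_damage_fraction(data):
    # Stand-in only (NOT the SN regression of the text): take the mean life at
    # each stress level as N(i), with one cycle per level per pass.
    return sum(1.0 / np.mean(N) for N in data.values())

# Hypothetical coupon data: stress level -> observed cycles to failure.
data = {1.5: np.array([9e3, 1.2e4, 1.0e4]),
        1.2: np.array([8e4, 1.1e5, 1.0e5])}

print(bootstrap_reliability(data, toy_damage_fraction, n_passes=9500, m_b=500))
```

In a fuller version, `damage_fraction` would refit the SN curve to each bootstrap dataset and then apply the linear damage rule to the spectrum loads.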

Reliability  Estimates  from  SN  Component  Curve  Simulations 

The following simulation procedure was applied in order to identify the effects of
uncertainties in the location of the component SNC curves on the reliability estimates.
The uncertainties are assumed because of the potential differences in loading, material,
surface conditions, and geometry between the coupon and component. Also contributing to
these uncertainties are: the extrapolation of the component fatigue data in determining the
Si's, as shown in Figure 3, and the availability of only six values in computing Sm (mean of
the curve), as shown in Figure 4. Examination of potential inaccuracies in the reliability
computations due to assuming that the component and coupon SN curve shapes are similar was
not included in the simulation process. Introducing variability in the curve's location was
sufficient for showing sensitivity in the reliability computation. In the simulation process, a
random set of M Sm* values was obtained. These values are normally distributed about the
Sm value in Figure 4 from the following:

$$ S_m^*(i) = S_m (1 + V_s \cdot Z_i), \quad i = 1, 2, \ldots, M, \qquad (7) $$

where the Zi's are values randomly selected from a standard normal distribution with a mean
of 0 and a variance of 1. The Vs value is the coefficient of variation (CV) and the mean is
Sm. In Figure A3, a representative normally distributed set of Sm* values* is shown for Vs
= 0.01 and Vs = 0.02. The newly obtained mean values (Sm*) are now used in vertical
positioning of the component SN curve, as shown in Figure 4, so that M SN curves can be
obtained from Equation 1 by the following:

$$ S_i^* = S(S_\infty, S_u, \beta, \gamma) + \Delta P_i, \quad i = 1, 2, \ldots, M, \qquad (8) $$

where ΔPi = Sm*(i) − Sm. M damage fraction values (DF*) are obtained from applying the
procedures described in the Miner's Rule Section and the schematics shown in Figures 5a and
5b using the newly available S* values.

*Represents simulation results.

From Miner's Rule, compute Np · DFi*, i = 1, 2, ..., M and record the number (Ns) of
times Np · DFi* < 1 for a given Np value, where Np represents the specified number of
passes. The component reliability R can be written as

R = Ns/M. (9)

Note that in order to obtain 0.999999 reliability, M = 1 x 10^6 simulations would be required.
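
The location simulation of Equations 7 to 9 can be sketched as below. Everything numerical here is an assumption made for illustration: the SN parameters, the five loads and cycle counts, and the value of Sm are hypothetical, and Equation 1 is taken in the form S = S∞ + (Su − S∞) exp(−β (log10 N)^γ). Because the shifted curve is S* = S(N) + ΔP, the cycles to failure at load L are read from the unshifted curve at L − ΔP.

```python
import numpy as np

# Hypothetical SN parameters and spectrum (illustration only).
S_INF, S_U, BETA, GAMMA = 1.0, 3.0, 0.05, 2.0
LOADS = np.array([1.1, 1.3, 1.5, 1.7, 1.9])      # normalized loads L(i)
CYCLES = np.array([1000, 500, 100, 20, 5])       # cycles n(i) per pass

def cycles_to_failure(S):
    """Invert the assumed SN curve; loads at or below S_INF never fail."""
    ratio = (np.asarray(S, float) - S_INF) / (S_U - S_INF)
    out = np.full(ratio.shape, np.inf)
    ok = ratio > 0
    r = np.minimum(ratio[ok], 1.0)               # at or above S_U: fails at N = 1
    with np.errstate(divide="ignore"):
        log10_n = np.exp((np.log(-np.log(r)) - np.log(BETA)) / GAMMA)
    out[ok] = 10.0 ** log10_n
    return out

def reliability_location(n_passes, v_s, s_m, m, seed=0):
    """R = Ns / M when the curve location varies as in Equation 7."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(m)
    delta = s_m * (1 + v_s * z) - s_m            # Delta P_i = Sm*(i) - Sm
    n_fail = cycles_to_failure(LOADS[None, :] - delta[:, None])
    df = (CYCLES / n_fail).sum(axis=1)           # damage fraction per simulation
    return np.mean(n_passes * df < 1.0)          # share with Np * DF* < 1

print(reliability_location(n_passes=300, v_s=0.05, s_m=1.08, m=10000))
```

With M simulations the estimate of R moves in steps of 1/M, which is why a reliability of 0.999999 cannot be resolved with fewer than about 10^6 simulations.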


Load  Uncertainties  Effect  On  Reliability  Computations 

A simulation procedure similar to that described in the previous section was applied in
order to identify the sensitivity in computing component reliability by introducing uncertainties
in the assumed spectrum loads (see Figure 5a). There exist potential errors involved in
assuming a specific load spectrum.¹³ They are the results of: an inaccurate measuring device, the
location of the device, and assuming load patterns determined from short periods of data
recording which differ from the actual loads the component would be subject to during its
operational lifetime.


Application of the simulation process involved only modeling uncertainties in the L
values, with n(i)s remaining constant for a given load. Introducing the same amount of
variability in each of the {L(i)}, i = 1, ..., 5, values was sufficient to show the sensitivity of the
reliability estimates from uncertainties in the loading.
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Initially,  the  simulation  involves  obtaining  Ml  sets  where  the  jth  set  {L?(i)}?_  j  is  deter¬ 
mined  from  the  following: 

Lj(i)  =  L(i)(l  +  VL  •  Zj ),  i  =  1,2,...,5  (10) 

where  j  =  1,2,..., Ml  and  Zj  is  a  random  value  from  a  standard  normal  N(0,1)  distribution. 

Vl  is  the  coefficient  of  variation  representing  an  assumed  variability  in  load  L(i). 

For  the  jth  simulation,  the  original  five  loads  (L(i)}  j,  as  shown  in  Figure  5a,  are  modified 
resulting  in  a  new  set  {LJ(i)}?  from  Equation  10.  The  distribution  of  L*(l)  for  all  j  values, 
for  example,  would  be  similar  to  that  for  Sm,  as  shown  in  Figure  A3.  J 

In the simulation process, the jth modified set and its associated $N^*$ values determine a
damage fraction value $DF_j^*$, as described in the Miner’s Rule Section and Figures 5a and 5b. In
order to obtain component reliability values from the load variability, Miner’s Rule is then
applied by recording the number ($N_1$) of times $N_P \cdot DF_j^* < 1$ for $j = 1, 2, \ldots, M_1$. The
component reliability $R$ is then written as

$$R = N_1 / M_1. \tag{11}$$
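The load-variability simulation of Equations 10 and 11 can be sketched as follows. This is only an illustration under stated assumptions: the S-N relation `cycles_to_failure` (a Basquin-type power law), its constants, and the five loads and cycle counts are hypothetical placeholders, not the report's Figure 5a data or its component curve.

```python
import random

# Hypothetical spectrum: five loads L(i) and cycles per pass n(i) (NOT the report's data).
LOADS = [1000.0, 1200.0, 1400.0, 1600.0, 1800.0]
CYCLES_PER_PASS = [500, 200, 100, 50, 10]


def cycles_to_failure(load, a=1.0e24, b=6.0):
    """Hypothetical power-law S-N curve, N = a * load**(-b) (an assumption)."""
    return a * load ** (-b)


def damage_fraction(loads, cycles_per_pass):
    """Miner's Rule damage per pass: sum of n(i) / N(i)."""
    return sum(n / cycles_to_failure(load) for load, n in zip(loads, cycles_per_pass))


def load_reliability(loads, cycles_per_pass, n_passes, v_load, m1, seed=0):
    """Estimate R = N_1 / M_1 (Equation 11) with loads perturbed as in Equation 10."""
    rng = random.Random(seed)
    n1 = 0
    for _ in range(m1):
        z = rng.gauss(0.0, 1.0)  # one Z_j per simulated set, shared by all five loads
        perturbed = [load * (1.0 + v_load * z) for load in loads]
        if n_passes * damage_fraction(perturbed, cycles_per_pass) < 1.0:
            n1 += 1
    return n1 / m1


if __name__ == "__main__":
    for cv in (0.01, 0.05):
        print(cv, load_reliability(LOADS, CYCLES_PER_PASS, 250, cv, 20000))
```

Because the same seed reuses the same $Z_j$ draws, the estimate for a larger $V_L$ can never exceed the estimate for a smaller one in this sketch, which mirrors the trend the report describes.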


Reliability  Sensitivity  from  Uncertainties  in  Miner’s  Rule 

A simulation procedure similar to those in the previous two sections was applied to the
Miner’s Rule relationship in Equation 5. This was done in order to examine the effects of a
possible error in assuming the component will fail when $N_P \cdot DF = 1$ (see Equation 5). In
order to identify the effects of this uncertainty in computing component reliability $R$, the
following simulation process was performed:

Initially, the value 1 in Equation 5 is replaced by a set of random numbers $\{CR_j\}_{j=1}^{M_2}$,
resulting in $N_P \cdot DF^* < CR_j$, where

$$CR_j = 1 + V_M \cdot Z_j, \quad j = 1, 2, \ldots, M_2 \tag{12}$$

and $V_M$ and $Z_j$ are the assumed coefficient of variation and standard normal value, as previously
defined in the above two sections.

The reliability $R$ is determined by recording the number ($N_2$) of times that

$$N_P \cdot DF^* < CR_j, \tag{13}$$

and then defining

$$R = N_2 / M_2 \tag{14}$$

where $M_2$ is the number of simulations.
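Equations 12 through 14 amount to comparing a fixed damage total against a randomized failure threshold. A minimal sketch, assuming the product $N_P \cdot DF^*$ is already available as a single number; the function name and the example inputs are illustrative only:

```python
import random


def miner_threshold_reliability(np_times_df, v_m, m2, seed=0):
    """Estimate R = N_2 / M_2, where N_2 counts draws with N_P*DF < CR_j = 1 + V_M * Z_j."""
    rng = random.Random(seed)
    n2 = 0
    for _ in range(m2):
        cr_j = 1.0 + v_m * rng.gauss(0.0, 1.0)  # Equation 12
        if np_times_df < cr_j:                  # Equation 13
            n2 += 1
    return n2 / m2                              # Equation 14


if __name__ == "__main__":
    # Hypothetical damage total of 0.95 with a 5% coefficient of variation on the threshold.
    print(miner_threshold_reliability(0.95, 0.05, 100000))
```

When the damage total sits well below 1 relative to $V_M$, almost every draw survives, which is in line with the report's finding that small variations in the Miner’s Rule criterion have little effect.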
WORKING SN CURVE

The adjustment of the mean component SNC curve from a limited amount of component
test data results in a certain amount of variability in estimating the location of the curve. In
order to account for this variability, and in some instances other uncertainties in the fatigue
analysis process, a component SNC curve reduction factor is often introduced, which results in
a new working SNW curve, as shown in Figure A4. There is no standard method for obtaining
a working curve in the helicopter industry (Reference 1). The working curve in Figure A4 was
obtained by a uniform reduction in all $S_c$ values. This approach maintains the same curve
shape as in the original SNC curve; i.e., the coupon SN curve shape. This approach is consistent
with the use of the coupon curve shape in the extrapolation process for each component
data value (see Figure 3) by which the original component curve $S_m$ value (see Figure 4) is
obtained.  In  Figure  3,  a  schematic  of  this  uniformity  is  shown  where  for  N  =  1  and 
N  *  10  show  an  equal  amount  of  assumed  dispersion  in  the  Sj  values. 

Reduction  Factors  for  Working  Curves 


Some of the reduction factors commonly used by the helicopter manufacturers are discussed
in References 15 and 16. In some cases a multiplication factor is used to obtain working
curve values, $S_w$; i.e.,

$$S_w = S_c - P \cdot S_m \tag{15}$$

where $S_c$ represents the strength values from the component curve, SNC, for various $P$ values.
$S_m$ was previously defined in Equation 3.

Another reduction procedure involves defining

$$S_w = S_c - 3 \cdot SD$$

where the standard deviation (SD) is often determined from an assumed standard coefficient
of variation for a particular material to represent the $S_i$ values shown in Figure 3 and in
Equation 3. A typical value for the coefficient of variation for steel is 7%. The SD value
is then written as $SD = 0.07 \cdot S_m$. One other method involves determining SD from the
actual $S_i$ values; i.e., $SD = \sqrt{\sum_i (S_i - S_m)^2 / (n - 1)}$, and substituting the SD value in
the preceding expression for $S_w$.

The  working  curve  was  introduced  in  this  report  in  order  to  evaluate  its  capability  to 
include  the  possible  variability  in  the  reliability  estimates  from  the  simulation  results. 
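The reduction procedures described in this section can be sketched in a few lines. The function names and the numeric inputs are hypothetical, and the sketch assumes that $S_m$ is the arithmetic mean of the $S_i$ values, which the text implies but this excerpt does not state outright.

```python
import math


def working_strength_fraction(s_c, s_m, p):
    """Equation 15: S_w = S_c - P * S_m."""
    return s_c - p * s_m


def sd_from_cv(s_m, cv=0.07):
    """SD from an assumed coefficient of variation (7% is the value quoted for steel)."""
    return cv * s_m


def sd_from_data(strengths):
    """Sample SD of the S_i values about their mean, with an (n - 1) divisor."""
    n = len(strengths)
    s_m = sum(strengths) / n  # assumption: S_m is the arithmetic mean of the S_i
    return math.sqrt(sum((s - s_m) ** 2 for s in strengths) / (n - 1))


def working_strength_three_sd(s_c, sd):
    """Three-SD reduction: S_w = S_c - 3 * SD."""
    return s_c - 3.0 * sd


if __name__ == "__main__":
    s_i = [98.0, 101.0, 100.0, 103.0, 98.0]  # hypothetical strength values
    s_m = sum(s_i) / len(s_i)
    print(working_strength_fraction(s_m, s_m, 0.20))
    print(working_strength_three_sd(s_m, sd_from_cv(s_m)))
    print(working_strength_three_sd(s_m, sd_from_data(s_i)))
```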


RESULTS  AND  DISCUSSIONS 

In this section, results from the simulation procedures are shown in both tabulated and
graphical form. Variability is introduced in combination, as well as individually, for all of the
following four factors: the spectrum load, the mean SN curve, Miner’s Rule, and the Bootstrap
process.

In Table 1 all four factors were varied for a range of CV values (% variability) from 1%
to 5%, except for the Bootstrap simulation, where the variability is obtained from coupon test
results. The component reliability results are tabulated as a function of the corresponding
CV values assumed in the simulation procedures. The results were obtained by systematically and
randomly selecting values from each of the four factors so that $1 \times 10^6$ distinct factor combinations
are obtained for computing the damage fraction (DF) in the Miner’s Rule Section. The reliability
($R$) is then obtained from the number of times $N_P \cdot DF^* < 1$ divided by $1 \times 10^6$.


Table 1. RELIABILITY VERSUS FACTOR VARIABILITY: LIFETIME = 3425

% Variability*    Reliability
     1.0           0.999999
     2.0           0.989676
     3.0           0.937250
     4.0           0.872101
     5.0           0.816061

*Simultaneous variability assumed for the following: spectrum load, mean curve,
Miner’s Rule = factor (1), and the Bootstrap process on defining mean curve are
applied.


In order to apply the simulation procedures, a 1% variability was introduced for each of
the factors and the number of passes ($N_P$ = 3425) was selected in order to obtain a baseline
reliability value of 0.999999. This value was selected because of the helicopter industry’s
interest in obtaining high component reliability of 0.999999.

The results in Table 1 show a substantial instability when comparing the reliability estimates
of 0.999999 and 0.989676 for the respective 1% and 2% variabilities. The implication of these
results is that in one case one failure in a million could occur, compared to 10324 failures in a
million in the other. This substantial difference for such a small increase in the inherent
variability in the assumed fatigue life models shows a severe sensitivity in computing high reliability
when there is a small degree of uncertainty in determining spectrum loads and SN curves and in
assuming a failure requirement from Miner’s Rule. The results from increasing the variability from 3%
to 5% show a corresponding reduction in reliability values. The R = 0.816061 for 5% variability
is a very large reduction from the original 0.999999 for 1% variability. The CV values shown in
Table 1 represent a range of potential parameter uncertainties in the fatigue life model.
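The failures-per-million figures quoted above follow directly from the Table 1 reliabilities. A short check, using only the tabulated values (the function name is illustrative):

```python
# Reliability values copied from Table 1 (keyed by % variability).
TABLE_1 = {1.0: 0.999999, 2.0: 0.989676, 3.0: 0.937250, 4.0: 0.872101, 5.0: 0.816061}


def failures_per_million(reliability):
    """Expected failures per 1,000,000 components for a given reliability."""
    return round((1.0 - reliability) * 1_000_000)


if __name__ == "__main__":
    for cv, r in TABLE_1.items():
        print(f"{cv:.1f}%  R = {r:.6f}  failures per million = {failures_per_million(r)}")
```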

In Table 2, reliability values are tabulated as a function of the combined and individual
variability of the four factors. This was done in order to examine the effects of the individual factor
variability on computing component reliability. The 1% variability was applied to all factors,
resulting in R = 0.999999 when $N_P$ is equal to 3425 (as in Table 1 at 1%). The 2% variability was
applied to each factor individually, with 1% variability for the other two factors. The Bootstrap
process was applied in all of the cases. Introducing a 2% CV in the spectrum load (SPL) shows
a substantial reduction in the reliability estimate from 0.999999 to 0.996404. The 2% variability
in the component SNC curve (MSN) shows a smaller reduction from 0.999999 to 0.999440, indicating
that, based upon the particular spectrum considered, the spectrum load uncertainties could result
in greater instability in the reliability values. Small variations in the Miner’s Rule assumption
(see Equation 13) do not appear to be as critical in the reliability computations. Increasing the
variability from 3% to 5% shows a continued decrease in reliability estimates except for the case
of Miner’s Rule variability, which has a very small reduction. The 5% variability on the spectrum
load shows a value R = 0.862469, which is only 5.7% greater than the value for the case where all
factors were varied simultaneously, as shown in Table 1 for 5% variability.




Table 2. RELIABILITY VERSUS INDIVIDUAL FACTOR VARIABILITY: LIFETIME = 3425

% Variability (P)                   Reliability
on Individual Factors*      SPL         MSN         MR
        1.0               0.999999    0.999999    0.999999
        2.0               0.996404    0.999440    0.999998
        3.0               0.967356    0.992375       --
        4.0               0.912587    0.972164    0.999997
        5.0               0.862469    0.941979    0.999994

*1% variability is applied to all factors except for the individual increase in factor variability (P)
in the first column. Bootstrap process also included.


In Table 3, reliabilities are obtained for the individual factors, spectrum load (SPL), and
location of the component SN curve (MSN). In order to obtain the R = 0.999999 value for 1%
variability on each of the factors, the number of passes ($N_P$) was 3700 for SPL and 4425 for
MSN. The lower $N_P$ value for SPL is consistent with the results in Table 2, since the R values
for SPL were lower than those for MSN when $N_P$ was 3425. In addition, it is obvious
that a lower number of cycles of operation would usually increase the reliability value. The
Bootstrap Method application resulted in a value of R = 0.999977 when combined with a 1%
variability in MSN. This indicates that the method is not introducing any substantial variability
compared to the SPL and MSN contribution in determining R. This is expected because
of the small amount of variability in the SN curves, as shown in Figures A1 and A2. In addition,
the range of cycle values contributing the most in determining the damage fraction has
the least amount of variability.


Table 3. RELIABILITY VERSUS INDIVIDUAL FACTOR VARIABILITY / LIFETIME

                     Reliability (R)
% Variability      SPL*        MSN†
     1.0         0.999999    0.999999
     2.5         0.969376    0.965875
     5.0         0.828010    0.818789

*3700 Lifetime value
†4425 Lifetime value

NOTE: Application of Bootstrap process simulation resulted in R = 0.999977
with 1% variability for MSN.

Table 4 shows the reliability results from reducing the $S_m$ value, shown in Figure 4 and
Equation 3, by the tabulated percentage in order to examine the possible material mean strength loss
from environmental effects such as corrosion. New values equal $(1 - p/100) S_m$, where p is the
tabulated percent reduction factor. In the case where p = 0, R = 0.999999 was obtained by varying
the SNC curve by 1% with $N_P$ = 4425, which is in agreement with the result in Table 3.
This variability in the SNC curve (MSN) was maintained for each of the reduced $S_m$ values.
When p = 1, then $0.99 S_m$ was used in the simulation process to obtain a reliability value equal
to 0.999852, compared to 0.999999 for no reduction in $S_m$. This result is not as substantial a
reduction in R as the case where the $S_m$ value is reduced by 5% and R = 0.324206. The overall
results indicate that loads which previously did not increase the damage fraction are now significant
contributors in reducing the component reliability. If there is a potential for material
strength loss due to corrosion, for example, then high reliability estimates are substantially
reduced by a small mean strength reduction.

Table 4. RELIABILITY VERSUS PERCENT REDUCTION MSN: LIFETIME = 4425

% Reduction    Reliability
    0.0          0.999999
    1.0          0.999852
    2.0          0.995542
    3.0          0.946600
    4.0          0.720650
    5.0          0.324206

NOTE: 1% variability on MSN


Table 5A shows the deterministic fatigue lifetime values obtained from the application
of various working curves described in the Working SN Curve Section. This computation
was introduced to evaluate the curves’ relative effectiveness in accounting for the uncertainties
in estimating the component SNC curve. This evaluation involves comparing results
from Tables 5A and 5B. In Table 5A, P = 0.5 is the reduction from $S_c$ in Equation 15.
This gives a lifetime of 0.325, which is a very conservative estimate compared to
the 6150 passes obtained from using the original component curve without a reduction.
The least conservative lifetime estimate is 2000, which was obtained from reducing the
component curve by three SDs, where SD was obtained by using the $S_i$ values in Figure 3 and
Equation 3. This estimate was less conservative than the 1225 lifetime value obtained
using an assumed CV = 0.07. The extrapolation process shown in Figure 3 may account
for the relatively low SD estimate for the case when the life value is 2000. The other
reduction factors result in a predictable decrease in the life estimate with an increase in
the reduction percent P.


Table 5A. LIFETIME VALUES FROM APPLICATION OF WORKING CURVES

Working Curve
(Adjustment on S)    Lifetime Value
     0.50*                0.325
     0.44                 48
     0.30                 500
     0.25                 850
     0.20                 1355
     S-3(sd)†             1225
     S-3(sd)              2000
     NA‡                  6150

*Percent reduction (P) on S, where (1-P)S is the location of the working curve
and S is the mean component strength at the endurance limit.
†Standard deviation determined from assuming 7% coefficient of
variation for S.
‡NA: No adjustment of SN curve.

Table 5B. LIFETIME VALUES WITH 0.999999 RELIABILITY
VERSUS VARIABILITY ON MSN AND SPL

% Variability    Lifetime Value
     1.0              3425
     2.0              1850
     3.0              875
     4.0              350
     5.0              50

In Table 5B, simultaneous variability on the component curve (MSN) and the spectrum
load (SPL) for 0.999999 reliability shows a reduction in the lifetime value with increasing
variability, which is consistent with prior results. By comparing results from Tables 5A and 5B,
the effectiveness of the working curve in obtaining 0.999999 reliability can be identified.
For example, a 1% variability shows a life of 3425, indicating that any of the working curves
would provide the required reliability, although the curve obtained from the three SD reduction
would be the least conservative acceptable method. Introducing 2% variability shows a
life estimate of 1850 which, in this case, requires using the three SD reduction procedure
where SD is obtained from assuming a 0.07 CV value. If the variability is assumed to be
5%, then a lifetime value of 50 is obtained, which would require a working curve reduction
factor of 0.44 in order to provide the 0.999999 reliability. If a 5% variability in the loading
and SN curve can exist, then most of the working curve procedures would be undesirable
methods for obtaining high reliability.
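The comparison of Tables 5A and 5B can be made explicit with a small lookup. The acceptance rule used here (a working curve is adequate when its deterministic life does not exceed the life that gives 0.999999 reliability) is a reading of the discussion above, and the dictionary labels are illustrative names for the table rows.

```python
# Deterministic lifetimes from Table 5A, keyed by working-curve adjustment.
WORKING_CURVE_LIFE = {
    "P=0.50": 0.325,
    "P=0.44": 48,
    "P=0.30": 500,
    "P=0.25": 850,
    "P=0.20": 1355,
    "S-3SD (7% CV)": 1225,
    "S-3SD (data)": 2000,
    "no adjustment": 6150,
}

# Lifetimes giving 0.999999 reliability from Table 5B, keyed by % variability.
REQUIRED_LIFE = {1.0: 3425, 2.0: 1850, 3.0: 875, 4.0: 350, 5.0: 50}


def acceptable_curves(variability_pct):
    """Working curves whose life is at or below the Table 5B life, least conservative first."""
    limit = REQUIRED_LIFE[variability_pct]
    ok = {name: life for name, life in WORKING_CURVE_LIFE.items() if life <= limit}
    return sorted(ok, key=ok.get, reverse=True)


if __name__ == "__main__":
    for cv in REQUIRED_LIFE:
        print(cv, acceptable_curves(cv))
```

Under this rule the 1% case admits every working curve, with the data-based three SD curve as the least conservative, and the 5% case admits only the 0.44 and 0.50 reductions, matching the text. For the 2% case the same rule would also admit the P = 0.20 curve (1355 passes) alongside the three SD curve with the assumed 7% CV.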

Using Equation 7, the result of introducing a 1% uncertainty in the positioning of the
component curve is shown in Figure 6 as a probability density function for the lifetime
estimate ($N_P = 1/DF$) determined from Equation 4. A 7.3% coefficient of variation was
obtained with a mean life of 6194. The inner range, $N_P \pm 3 \cdot SD$, is 4964 to 7689 when
the function is assumed to be log-normal; this is a substantial variability in the life estimate
for a very small amount of variability in the location of the SN curve.
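The quoted inner range is consistent with a three-standard-deviation interval taken in log space. A minimal sketch, assuming the log-normal parameters are obtained by moment matching from the reported mean and CV (the report does not say how the distribution was fitted, so this is only one plausible reconstruction):

```python
import math


def lognormal_inner_range(mean_life, cv, k=3.0):
    """Return exp(mu - k*sigma), exp(mu + k*sigma) for a log-normal with the given mean and CV."""
    sigma = math.sqrt(math.log(1.0 + cv ** 2))    # log-space standard deviation
    mu = math.log(mean_life) - 0.5 * sigma ** 2   # log-space mean
    return math.exp(mu - k * sigma), math.exp(mu + k * sigma)


if __name__ == "__main__":
    low, high = lognormal_inner_range(6194.0, 0.073)
    print(round(low), round(high))
```

With the Figure 6 values (mean 6194, CV 7.3%) this gives limits within a few passes of the reported 4964 and 7689; the small difference is plausibly due to rounding of the reported CV.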

In Figure 7, a density function for the life estimate was obtained from an assumed 5%
variability using the same procedures as described above. In this case, the CV was 37.5%
with a mean equal to 6621. The inner three SD range is 2065 to 18587 for the lifetime
value estimates. This exceptionally large dispersion in the life estimates for a moderate
amount of variability (5%) in the location of the mean curve indicates instability in estimating
lifetime values. It is noted that by taking the log of the data, a normal function was
obtained, indicating that the fatigue estimate can be represented by a log-normal distribution.

In Figure 8, a computation similar to that described for Figure 6 was performed in order
to determine the difference in life values between the 1% and 0.0001% points, corresponding to
reliabilities of 0.99 and 0.999999, respectively. A 1% variability in the spectrum was assumed
in the computation of $N_P$. A CV of 10.8% was obtained with a mean of 6203. Results
show a life of 4795 for the lower reliability of 0.99 and 3689 for the higher reliability of
0.999999, a 23% decrease in the lifetime estimate.

Figure 6. Component fatigue life probability density function. (Axis labels: Frequency, Lifetime.)

Figure 7. Component fatigue life probability density function.

Figure 8. Component fatigue life probability density function and reliability.

Figure 9, where a 5% variability in the spectrum load was introduced, shows a log-normal
distribution of lifetime values similar to that in Figure 7 for the SNC curve variability. The
inner range of 1075 to 31956 again shows the substantial variability in the life estimate, indicating
a serious instability in the fatigue life computation approach when even small uncertainties
exist in assuming a specified spectrum load. Load spectrum and fatigue strength CVs in
the range of 7% to 13% are being considered by the helicopter industry (Reference 17). A comparison
of the reliabilities of 0.99 and 0.999999 for the respective lifetimes showed 1702 and 448
passes, which is a 74% decrease in lifetime. This is a much greater percent decrease than
that of the 1% variability case in Figure 8. This assumed variability is probably more realistic
than that of 1% which was previously assumed.

Comparison of these figures shows uncertainties in safe life fatigue design in terms of
changes in design lifetime for a fixed reliability, whereas the results in Tables 1 through 4
show variability in terms of changes in reliability for fixed lifetimes.

Although only a simple case has been considered, the modeling and simulation processes
are capable of dealing with more complex safe life fatigue designs. Such designs could
include more complex load spectra and additional parameters in the fatigue life model. The
value of any high reliability based analyses, whether simple or complex, appears to be in
question in view of the very substantial sensitivity of the reliability and lifetime results from
this study.

Figure 9. Component fatigue life probability density function and reliability.


CONCLUSIONS 

A  small  amount  of  variability  (uncertainty)  in  load  or  strength  in  the  safe  life  fatigue 
model  can  result  in  a  substantial  reduction  in  high  reliability  values  for  a  specified  lifetime  of 
a  component.  These  uncertainties  can  also  result  in  very  unstable  lifetime  estimates  for  a 
given  reliability.  In  contrast,  the  small  variations  assumed  in  the  Miner’s  Rule  criterion,  and 
the  variability  in  the  SN  coupon  curve  determination,  caused  a  minimal  amount  of  change  in 
the  reliability  estimates. 

A small percent reduction in the strength values in the component SN curve (for example,
from corrosion effects) can result in a large decrease in the reliability values.

Introducing working curves in the fatigue life computation is only effective when there is
a small amount of variability in the SN component curve or when the reduction factor is
very large.

In view of the sensitivity of the safe life reliability criterion of 0.999999 to the modest
variability considered in this analysis, it appears that the 0.999999 reliability is ineffective as a
criterion to ensure safety for a specified service life. In summary, this report has identified
a potential problem associated with obtaining a meaningful quantitative measure of reliability
for a fatigue loaded component.

Figure A2. Regression SN curves from Bootstrapping (N-independent variable).

Abstract

Randomization  procedures  offer  a  viable  approach  to  the  analysis  of  ballistic  data  over  a 
wide  class  of  problems.  Distribution  assumptions  may  be  avoided  and,  of  even  greater 
importance,  random  samples  of  data  are  not  required.  Small  sample  sizes,  while  never 
welcome,  may  be  accommodated  as  well. 

Introduction 

This  is  an  applications  paper  that  details  a  problem  that  is  representative  in  many  respects 
of  those  engendered  by  ballistic  data.  Sample  sizes  are  woefully  small  due  to  the  cost  of  data 
collection  and/or  scarcity  of  testing  materiel.  The  samples  themselves  are  usually  nonrandom 
and  distribution  assumptions  are  tentative.  Historical  data,  when  available,  cannot  be  easily 
amalgamated  to  assist  in  inference. 

An  approach  known  generically  as  randomization,  suggested  by  Fisher  [2]  and  extended  to 
nonrandom  samples  by  Pitman  [5]  holds  particular  appeal,  since  distribution  assumptions  and 
random  sample  requirements  may  be  relaxed.  Edgington  [1]  asserts  that  "Few  experiments  in 
biology,  education,  medicine,  psychology,  or  any  other  field  use  randomly  selected  subjects, 
and  those  that  do  usually  concern  populations  so  specific  as  to  be  of  little  interest.  ...  The 
population  of  interest  to  the  experimenter  is  likely  to  be  one  that  cannot  be  sampled 
randomly."  Edgington’s  words  ring  true  in  the  example  to  follow. 


The  problem:  Stability  of  a  kinetic  energy  penetrator 

Kinetic  energy  penetrator  technology  has  undergone  a  metamorphosis  from  the  days 
when  solid  balls  were  launched  from  cannons  or  catapults  against  sailing  ships  and  forts.  The 
most  obvious  change  has  been  in  the  overall  configuration  of  the  projectile.  The  ratio  of  the 
projectile’s  length  to  its  diameter  has  gone  from  one  to  over  twenty,  as  illustrated  in  Figure  1. 
This  change  has  taken  place  largely  in  response  to  the  changing  targets  which  kinetic  energy 
penetrators  must  confront. 


Figure 1. KE Penetrator Evolvement

The  changes  in  tank  ammunition  design  have  followed  the  trend  depicted  in  Figure  2. 
The armor piercing, discarding sabot ammunition (Fig. 2(b)) has a penetrator whose outer
diameter  is  less  than  the  inner  bore  diameter  of  the  gun  tube.  The  difference  is  compensated 
for  by  a  sabot,  which  carries  the  penetrator  down  the  gun  tube  and  is  then  discarded.  The 
long  rod  penetrator  (Fig.  2(c))  is  essentially  a  long  rod  of  exceedingly  dense  material,  typically 
tungsten  alloy  or  depleted  uranium,  over  twice  as  dense  as  steel.  In  addition  to  a  discarding 
sabot,  the  penetrator  has  fins  which  increase  the  stability  of  the  rod  in  flight. 


Figure  2.  Armor  Piercing  Ammunition 

Table 1 contains measurements of spin rates of long rod penetrators taken by Rapacki [6].
The natural frequency of the penetrators is about 120 cycles per second (Hz). Spin rates close
to this value amplify the initial manufacturing imperfections and increase in-flight bending.
To avoid this, the fins are reconfigured to reduce the spin rate to an appropriate level below
120 Hz.


Table 1. Comparison of Two Fin Redesigns with a Control

design-0    redesign-1    redesign-2
  163.6        97.5          78.1
  109.0       122.2          76.7
  218.7       108.2          88.5
  143.2
  169.5

If the spin rate is too high, as in design-0 (or control), the penetrator may become bowed
in flight (sometimes to the point of breaking) and become unstable. Conversely, if the spin
rate is too low, the penetrator may again become unstable. An optimal spin rate cannot be
determined analytically, and resources are not adequate for extensive empirical study.

An engineering consideration with important implications for analysis of these data is the
following: As the penetrator becomes more stable, the variance of the measured spin rates
will decrease. For analysis of the data in Table 1, this establishes a multi-sample situation
with possible heterogeneity of variance between samples, and where variance stabilizing
transformations are inappropriate since differences in both location and dispersion are relevant
to inference about the penetrator design. This effectively removes from consideration
classical analysis of variance procedures for analysis of these data.

Figure 3. Spin Rate Data


A  randomization  approach  to  the  data  analysis 

Figure 3 suggests that the fin reconfigurations had the intended effect: reducing the spin
rate and increasing stability. To pursue quantitative support for this observation we will
appeal to a randomization argument, choosing as a null hypothesis that the two fin redesigns
are ineffectual and provide no improvement over the initial design. If the null hypothesis is
true, then the categorical labels (design-0, redesign-1, and redesign-2) are completely
arbitrary, and the eleven observations could be randomly assigned to the columns of Table 1
(retaining the same number of observations per column) without any attendant statistical
consequences.¹

We will consider restricted null hypotheses in which redesign-1 is compared to control and
then redesign-2 is compared to control, rather than an omnibus test. This focuses attention
on the comparisons of interest while easing the overall computational burden. Figure 4
represents the $\binom{8}{3} = 56$ data configurations that are produced by systematic reassignment of
datum values within columns one and two of Table 1. For each resultant configuration, the
difference in location between control and redesign-1, $\bar{x}_0 - \bar{x}_1$, is plotted on the x-axis and the
variance ratio control/redesign-1, $s_0/s_1$, is plotted on the y-axis.


¹ Some authors assume random assignment of homogeneous experimental units to control and
treatment groups. We are necessarily in violation of this assumption, and arguably are detailing a
permutation test rather than a randomization test. In either case, the procedure remains invariant.

To determine an observed significance level of the data in Table 1 relative to the data sets
generated by reassignment, the point ($\bar{x}_0 - \bar{x}_1$, $s_0/s_1$) calculated from the data in columns one
and two of Table 1 can be ranked against the remaining fifty-five points. We will specify a
naive procedure for ranking the ordered pairs $(x_i, y_i)$ which will suffice for these data, and
which retains the structure of nonparametric rank tests (Lehmann [3]). We first rank the
x-coordinates, assigning to the largest value the rank 1, the second largest rank 2, and so on.
We rank the y-coordinates in the same way. Finally, we sum the ranks assigned to the x- and
y-coordinates. In case of ties, the ranks are averaged.

Using this procedure, the observed data (51.50, 3.24), with coordinate ranks (2, 4) and a combined
rank of six, is tied with two other pairs and is assigned an overall rank of three. The observed
significance level is then 3/56 = .054.
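The reassignment and ranking procedure can be sketched as below. Several choices here are assumptions rather than the authors' stated method: the five design-0 and three redesign-1 values are read from the columns of Table 1, the dispersion statistic is taken as the ratio of sample standard deviations (which reproduces the observed 3.24), and the significance level is taken as the share of configurations whose combined rank is no larger than the observed one. The sketch makes no claim to reproduce the published ranks exactly.

```python
from itertools import combinations
from statistics import mean, stdev

CONTROL = [163.6, 109.0, 218.7, 143.2, 169.5]   # design-0 column of Table 1
REDESIGN_1 = [97.5, 122.2, 108.2]               # redesign-1 column of Table 1


def location_and_dispersion(group0, group1):
    """Return (mean difference, ratio of sample SDs), rounded to stabilize tie detection."""
    return (round(mean(group0) - mean(group1), 6),
            round(stdev(group0) / stdev(group1), 6))


def average_ranks_descending(values):
    """Rank 1 goes to the largest value; tied values share the average of their ranks."""
    ordered = sorted(values, reverse=True)
    first_pos, count = {}, {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first_pos[v] + (count[v] - 1) / 2 for v in values]


pooled = CONTROL + REDESIGN_1
splits = list(combinations(range(len(pooled)), len(REDESIGN_1)))  # 8 choose 3 = 56
configs = []
for idx1 in splits:
    g1 = [pooled[i] for i in idx1]
    g0 = [pooled[i] for i in range(len(pooled)) if i not in idx1]
    configs.append(location_and_dispersion(g0, g1))

x_ranks = average_ranks_descending([c[0] for c in configs])
y_ranks = average_ranks_descending([c[1] for c in configs])
combined = [xr + yr for xr, yr in zip(x_ranks, y_ranks)]

obs = splits.index((5, 6, 7))  # the split that reproduces the observed columns
p_value = sum(1 for c in combined if c <= combined[obs]) / len(configs)

if __name__ == "__main__":
    print(configs[obs], x_ranks[obs], y_ranks[obs], p_value)
```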

We  knew  beforehand  the  two  restricted  hypotheses  of  interest,  and  as  such  might  invoke  a 
planned  comparisons  argument.  But,  we  are  testing  two  of  three  possible  comparisons  and  so 
a  multiple  comparisons  procedure  is  a  more  conservative  approach.  Experimentwise  error 
rate  (Miller  [4])  introduced  through  multiple  comparisons  will  be  controlled  with  the  aid  of 
Fisher’s  modified  least  significant  difference  procedure  (Winer  [7])  which  has  the  desirable 
properties  of  being  both  nonparametric  and  applicable  to  unequal  sample  sizes. 

Suppose we specify an experimentwise error rate of $\alpha = .05$ for comparison of the two
fin redesigns with the control. Adopting the obvious notation c, d1, d2 for control and
redesign, we are interested in the comparisons c-d1 and c-d2. The observed significance level
is determined for each of the pairwise comparisons following the randomization procedure
outlined above. Each p-value is then multiplied by two (the number of comparisons) in
accordance with Fisher’s procedure to obtain an adjusted p-value. The p-values and adjusted
p-values for comparison of c-d1 and c-d2 are given in Table 2.

Table 2. Multiple Comparison of Control and Two Treatments

comparison          c-d1    c-d2
p-value             .054    .018
adjusted p-value    .107    .036
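The adjustment in Table 2 is a multiplication of each p-value by the number of comparisons. A brief sketch follows; the cap at 1 is a common convention added here, and the c-d2 value of 1/56 is an assumption that happens to be consistent with the tabulated .018, since the text gives only the c-d1 fraction 3/56.

```python
def adjusted_p_value(p_value, n_comparisons):
    """Multiply the p-value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)


if __name__ == "__main__":
    p_c_d1 = 3 / 56   # stated in the text
    p_c_d2 = 1 / 56   # assumed; consistent with the tabulated .018
    for label, p in (("c-d1", p_c_d1), ("c-d2", p_c_d2)):
        print(label, round(p, 3), round(adjusted_p_value(p, 2), 3))
```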


The adjusted p-value, .036, corresponding to comparison of control and redesign-2, falls
below the α = .05 value chosen for experimentwise error rate, and reflects a statistically
significant difference between the two penetrator designs. Comparison of control and
redesign-1, with an adjusted p-value of .107, exceeds α = .05, and does not substantiate a
claim of difference. These conclusions, now quantified, remain consistent with the display in
Figure 1.
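
The adjustment in Table 2 can be reproduced with a short sketch. It assumes the raw p-values are 3/56 (quoted in the text) and 1/56 (inferred here from the rounded value .018, not stated in the source); the function name `adjust_p_values` is illustrative only. Multiplying each p-value by the number of comparisons is a Bonferroni-type correction, capped at 1.

```python
from fractions import Fraction


def adjust_p_values(p_values, cap=1):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {name: min(p * m, cap) for name, p in p_values.items()}


# Assumed raw p-values: 3/56 is quoted in the text; 1/56 is inferred from .018.
raw = {"c-d1": Fraction(3, 56), "c-d2": Fraction(1, 56)}
adjusted = adjust_p_values(raw)
alpha = 0.05  # experimentwise error rate chosen in the text

for name, p in adjusted.items():
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name}: raw={float(raw[name]):.3f} adjusted={float(p):.3f} ({verdict})")
```

Exact fractions are used so that the rounding to three decimals happens only when printing.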

Conclusion 

Randomization  procedures  offer  a  viable  approach  to  the  analysis  of  ballistic  data  over  a 
wide  class  of  problems.  Distribution  assumptions  may  be  avoided  and,  of  even  greater 
importance, random samples of data are not required. Small sample sizes, while never
welcome, may be accommodated as well.

In statistics, as elsewhere, there is no free lunch. The price paid for randomization is
computation: every problem requires a tailored solution, reflected in what is required to
determine the p-values. However, the normal-theory statistics (t-test, F-test, chi-square
test, etc.) may only be valid to the extent that they approximate the p-values obtained
from randomization.

Abstract. The material properties of kinetic energy penetrators are studied in a 1/4-scale test
environment at the Ballistic Research Laboratory. Metallurgists fire penetrators of various
material compositions into semi-infinite steel blocks and record depths of penetration. Depth
of penetration behaves approximately as a linear function of velocity, d(v), over the range of
the four-velocity design routinely employed. Under a common slopes assumption, a difference
in performance between penetrators k and l is computed as $d_k(v) - d_l(v)$. This difference is
determined graphically, occasionally with the benefit of a least-squares fit to each
performance. Statements of significance are not made at present. In this paper, a randomization
test is examined as a means for providing analytical support for inference.

1.  Introduction 

Material  properties  of  kinetic  energy  penetrators  are  compared  at  the  Ballistic  Research 
Laboratory  in  a  1/4-scale  test  environment.  Metallurgists  fire  penetrators  of  various  material 
compositions into semi-infinite steel blocks and record depths of penetration. Depth of
penetration behaves approximately as a linear function of velocity, d(v), over the range of the
four-velocity design routinely employed. Under a common slopes assumption, a difference in
performance between penetrators k and l is computed as $d_k(v) - d_l(v)$. This difference is
determined graphically, occasionally with the benefit of a least-squares fit to each
performance. Statements of significance are not made at present. In this paper, a randomization
test  is  presented  as  a  means  for  providing  analytical  support  for  inference. 

Inferences drawn from such experimentation may be considered the result of
meta-analysis. Meta-analysis is loosely described as the "integration of independent studies" in a
book by Hedges and Olkin [1985]. This area has received much recent attention in the social
and biological sciences, but in the physical and engineering sciences it has received little
notice with the exception of a few historical papers (e.g., Tippett [1931] and Fisher [1932]) that
have been classified in retrospect as meta-analyses. The independent-studies quality of the
aforementioned problem stems from the combination of data sets gathered at different times
(often different years) and by different experimenters. This fact, practically speaking,
invalidates a necessary assumption for normal theory analyses, namely the belief that the subjects
for the combined data set are the result of a random sample.¹ Taylor and Bodt [1991]
recommend surmounting this problem through the use of randomization tests and demonstrate
applicability of this methodology to significance testing with ballistic data.

¹ In an ideal situation one would design a multiyear experiment where random sampling did occur, but the obstacles are so
formidable in this testing environment that it is not done.

The purpose of this paper is to introduce a randomization test for comparing 1/4-scale
kinetic energy penetrators. A description of the data collection is followed by the discussion
of a linear model through which significance testing of relevant contrasts can be made. It is
then demonstrated how a reference distribution for determining significance can be achieved
through randomization. Application of the procedure and discussion of the results follow.

2.  The  Data  Collection 

The measured response, $d_{ij}$, is the depth of penetration permitted by a semi-infinite steel
block subjected to a hemi-nose penetrator of material j, fired at velocity i. Semi-infinite
describes the independence of the penetration action from influences of side and rear free
surfaces (i.e., the block is for practical purposes infinite with respect to width and depth).
Hemi-nose  refers  to  the  hemispherical  configuration  of  the  projectile  nose.  Figure  1  shows 
the  cut-away  profile  of  a  semi-infinite  block,  where  the  cut  is  made  along  the  shot  line. 
Depth  of  penetration  is  taken  to  be  the  maximum  normal  distance  between  the  original 
entry-point  surface  and  the  bottom  surface  of  the  hole. 

Depths of penetration from penetrators of several different material compositions are
gathered over several velocities. The design structure suggests that the experimental units
are the semi-infinite steel blocks. It is these that are exposed to the two treatments, velocity
and penetrator material. Velocity is included as a test condition because it will affect
penetration depth. Penetrator material is the only treatment of interest: materials are to be
compared for relative effectiveness. Confidence in the assessment of relative performance is
ensured through comparison over a range of velocities meaningful to the Army application
(i.e., over a typical ordnance velocity range). A template for the experiment is to fire each
penetrator (material) once at each of the following four nominal velocities: 1100 m/s, 1300
m/s, 1500 m/s, and 1700 m/s. Actual velocities will vary. A design matrix overlaid on a
combined data set including different materials might appear as Figure 2.

Other facets of data collection influence the analysis. Penetrators are tested in separate
experiments, quite possibly over as many as ten years if the purpose is to compare new
materials to an historical control. Small sample sizes with no replication prevail if one
adheres to the template for testing materials. There is no random sampling from a population
of semi-infinite blocks; indeed, at the time of the first experiment, blocks used in later firings
may not yet have been manufactured. Even if the sample were random, there is no guarantee
that the comfort of approximate normality can
be afforded by the Central Limit Theorem with the sample sizes and replication considered.

3.  The  Linear  Model 

A linear models framework is presented in this section to support inference for this
[...]. For a comprehensive, but introductory, treatment, it is
suggested the reader turn to Neter and Wasserman [1974]. The problem is first described in
the context of a two-factor factorial design, followed by a refinement in the form of an
analysis of covariance model. A convenient regression form of this model is then used to
construct meaningful contrasts, and assumptions required for traditional significance testing of
those contrasts are discussed.

3.1 Factorial Design

The design matrix shown in Figure 2 and the problem description suggest that a factorial
design may be appropriate, with penetrator material serving as the principal treatment under
study and velocity serving as an additional design variable. The additive model is expressed

$$d_{ij} = \mu + V_i + M_j + e_{ij}, \qquad (1)$$

where $\mu$ is the common mean response, $V_i$ and $M_j$ are the effects (shifts from that mean)
caused by the ith velocity and the jth material, respectively, and $e_{ij}$ is the error associated with
the (ij)th response. A Model-I stance is assumed, indicating that both material and velocity
are treated as fixed effects.

Two facts render this approach less than ideal. The first, stated in the Introduction, is
that experimenters know that penetration depth behaves approximately linearly with velocity.
Even further, experience has shown that $d_k(v)$ and $d_l(v)$ are virtually parallel over the 1100
m/s to 1700 m/s velocity regime, hence the additivity assumption above. Beyond this regime
the  assumptions  of  linearity  and  parallel  lines  do  not  hold.  The  second  is  that  although  four 
nominal  velocities  are  intended,  the  actual  velocities  tested  often  number  as  many  as  the 
number  of  1/4-scale  rods  fired.  Because  firing  velocity  cannot  be  completely  controlled,  each 
nominal  velocity  actually  encompasses  a  range  of  velocities  close  to  the  nominal.  Figure  3 
illustrates  both  linearity  and  firing  velocity  noise  in  replication  of  some  tungsten  alloy  firings 
at  the  four  nominal  velocities. 


This additional information impacts the method of analysis. Taking advantage of
linearity would save the experimenter degrees of freedom to apply in the estimation of error, so
more efficiency in the model is possible. Left unconsidered, firing velocity noise would increase the
estimate of response variability. In the next section the analysis of covariance model is
suggested, having the advantage that both linearity and firing velocity variation can be
incorporated.

3.2  Analysis  of  Covariance 
3.2.1  Traditional  Model 


The linear relationship between velocity and depth of penetration can be made part of
the linear model as follows. First, rewrite Equation 1 in terms of marginal means as

$$d_{ij} = \mu + (\mu_{i\cdot} - \mu) + (\mu_{\cdot j} - \mu) + (d_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu), \qquad (2)$$

where the dot subscript means to pool over that index (i.e., to average based on the sum in
the margin). Introduce in the model the term $\mu_{d|v}$ to represent the simple linear relationship
between velocity and the mean response. Adding and subtracting $\mu_{d|v}$ from the right side of
Equation 2 and rearranging terms leaves

$$d_{ij} = \mu_{d|v} + (\mu_{\cdot j} - \mu) + (d_{ij} - \mu_{d|v} - \mu_{\cdot j} + \mu). \qquad (3)$$

Let $v_{ij}$ represent the velocity of the (ij)th penetrator. The simple linear model which regresses
penetration depth on velocity can then be expressed as $\mu + \gamma(v_{ij} - \bar{v}_{\cdot\cdot})$, where $\gamma$ is the slope of
the regression. Substituting this for $\mu_{d|v}$ in Equation 3 yields

$$d_{ij} = \mu + (\mu_{\cdot j} - \mu) + \gamma(v_{ij} - \bar{v}_{\cdot\cdot}) + (d_{ij} - \mu_{\cdot j} - \gamma(v_{ij} - \bar{v}_{\cdot\cdot})), \qquad (4)$$

which the reader recognizes as the common form of an analysis of covariance model.

Certainly, the analysis of covariance model in Equation 4 has appeal in that it can
account for the contribution to penetration depth from individual velocities, whereas in the
factorial design the contributions of nominal velocities are counted as being the same
regardless of noise. Further, even if the nominal velocities were exactly achieved, there is advantage
to be gained in introducing the linearity information in the model. In that case, degrees of
freedom for estimating error are saved. The factorial design allows (s-2) fewer degrees of
freedom for error, where s represents the number of nominal velocities. This follows directly
from the fact that the factorial design requires s-1 degrees of freedom be assigned to
velocity, whereas the simple linear regression needs only 1 degree of freedom assigned to the
slope to account for the influence of velocity. If the regression is perfect (fits exactly to the
mean response for each nominal velocity), the sums of squares associated with error for both
models are identical, leaving analysis of covariance with a decided advantage. If the
regression is not perfect, a tradeoff is made wherein degrees of freedom for the error term
denominator are gained at the expense of the regression lack-of-fit being added in the numerator. In
consideration of data with a strong linear relationship like those displayed in Figure 3, an
analysis of covariance approach would be a more appropriate choice than the two-factor
factorial.
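
The degrees-of-freedom comparison can be written out as a worked count. This is a sketch under an assumption not spelled out in the text: exactly one firing per material at each nominal velocity (the template of Section 2), so that t materials and s velocities give st observations.

```latex
\begin{align*}
\text{factorial error df}  &= (st - 1) - (s - 1) - (t - 1) = (s - 1)(t - 1), \\
\text{covariance error df} &= (st - 1) - (t - 1) - 1 = st - t - 1, \\
\text{difference}          &= (st - t - 1) - (s - 1)(t - 1) = s - 2 .
\end{align*}
```

With s = 4 nominal velocities and, for illustration, t = 4 materials, the factorial leaves 9 degrees of freedom for error and the analysis of covariance leaves 11.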

Using the analysis of covariance model to describe the problem structure, questions
regarding material comparisons can be answered through the study of contrasts. If the
experimenter is interested in the difference in the effect of any two materials k and l, the contrast
$M_k - M_l$ would be estimated and then tested for significance.

3.2.2  Regression  Formulation 

It is convenient to formulate Equation 4 in terms of a regression model. From an
applications perspective, the least-squares approach is more widely understood and accepted by
practitioners. Moreover, the parameters have greater intuitive appeal, and their meaning
conforms to how experimenters at the Ballistic Research Laboratory currently think of the
problem.

The change is accomplished easily. Replace the t-level treatment factor with indicator
variables $m_{ijk}$, $k = 1, 2, \ldots, t-1$, defined such that

$$m_{ijk} = \begin{cases} 1 & \text{if the observation is of material } k, \\ 0 & \text{otherwise.} \end{cases}$$

² Regression is also of use, computationally, when the design matrix is unbalanced.

The columns in the regression design matrix corresponding to the indicator variables will be
mutually orthogonal. Thus, Equation 4 may be expressed in terms of a regression model as

$$d_{ij} = \beta_0 + \beta_1 m_{ij1} + \beta_2 m_{ij2} + \cdots + \beta_{t-1} m_{ij,t-1} + \beta_t (v_{ij} - \bar{v}_{\cdot\cdot}) + e_{ij}, \qquad (5)$$

where

$$\beta_0 = \mu + M_t, \qquad \beta_k = M_k - M_t \ (k = 1, 2, \ldots, t-1), \qquad \beta_t = \gamma.$$

The coefficients $\beta_k$, $k = 1, 2, \ldots, t-1$, represent the difference between the effects of the kth and
tth materials (i.e., the vertical difference between the regression lines, $d_k(v) - d_t(v)$). The
designation of the tth material is arbitrary, determined by how the indicator variables are defined.
In the design matrix for the regression model, the tth material would have zeros in the
sub-rows corresponding to the t-1 indicator variables. The interpretation of the $\beta_k$s would be
most natural if a reference group or an historical control was denoted the tth material. Other
comparisons may also be of interest. The general contrast $M_k - M_l$, $k, l \neq t$, is obtained through
the difference $\beta_k - \beta_l$.
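
The last statement follows in one line from the definitions accompanying Equation 5, since the reference effect $M_t$ cancels:

```latex
\[
\beta_k - \beta_l = (M_k - M_t) - (M_l - M_t) = M_k - M_l , \qquad k, l \neq t .
\]
```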

In this section the treatment effects were expressed in the context of a regression
formulation of the analysis of covariance model. Estimation of these effects can be accomplished
after first determining the least squares estimate of the coefficient vector. The next step, and
the main focus of this effort, is to determine the significance of these effects. To begin, a
careful consideration of the assumptions is made.

3.2.3  Assumptions 

Several  assumptions  are  required  to  support  the  usual  analysis  of  covariance  for  this 
problem.  They  appear  as  follows:  1)  the  regression  slopes  are  nonzero  and  homogeneous 
among  materials,  2)  velocity  is  unaffected  by  material,  3)  velocity  is  precisely  measured,  4) 
model  errors  are  distributed  with  zero  mean  and  common  variance,  and  5)  the  responses  are 
considered  jointly  independent  normal  random  variables.  The  practical  implication  of  4)  and 
5)  together  is  that  penetration  depths  to  be  allowed  by  the  semi-infinite  blocks  constitute  a 
random  sample  from  some  conceptual  normal  population. 

The first four assumptions are accepted; the last is not. Velocity obviously affects
penetration depth, and data support the similar-slopes claim. All test penetrators are identical in
dimension;  there  is  no  reason  to  expect  that  velocity  will  be  influenced  by  which  material 
composition  is  being  tested.  Velocity,  though  not  completely  controlled,  is  precisely  measured 
using  an  x-ray  multiflash  system.  As  for  the  last  assumption,  there  is  no  reason  to  expect  that 
penetration  depths  are  normal,  and  because  of  the  individual-study  nature  of  the  experiments, 
they  do  not  constitute  a  random  sample. 

In  Section  4  we  relax  this  last  assumption  to  require  only  that  the  penetration  depths  be 
pairwise  uncorrelated.  With  that  change,  the  least-squares  estimation  of  the  parameters  in 
Equation  5  will  retain  the  usual  properties  of  uniform  minimum  variance  among  linear 
unbiased  estimators  but  without  any  known  distribution  on  which  to  base  tests  of  significance. 
Under  these  revised  model  assumptions,  an  alternative  test  for  significance  is  given. 

4. A Randomization Test

4.2 Description

Consider first $H_0: \beta_k = 0$.³ The geometrical interpretation of $\beta_k$ is that it is the vertical
distance between the parallel regression lines $d_k(v)$ and $d_t(v)$. This fact is evident from
Equation 5. The linear effect of velocity can be removed by adjusting the penetration depth values
for the velocities used to achieve them; the remaining difference among the adjusted values,
excluding random variation, is attributable to material and is expressed as $\beta_k$. This difference is
estimated as $b_k$ by subtracting the average of the residuals resulting from material t from
that of material k, the residuals being computed relative to $d_t(v)$ in each case. Thus, once the
two groups of residuals are formed, we are interested in the difference in location between
them.

To determine if this difference is significant, we need only establish a reference
distribution and compare the observed difference to it. Under the null hypothesis, $d_k(v)$ and $d_t(v)$ are
coincident. Thus, the residuals computed after adjusting for the linear effect of velocity should
be homogeneous. Therefore, in computing $b_k$, the distinction of which residuals resulted from
assignment (association) with material k or material t should make little difference. The
reference distribution is constructed by computing $b_k$ under all possible assignments of
residuals (effectively ignoring material distinction) to the two materials, the cardinality of each
material set being preserved.⁴ For example, if material k had five data values and material t
had four, there would be $\binom{5+4}{5}$ values computed for $b_k$. The p-value for the two-sided
alternative hypothesis is simply the ratio of the number of values in the reference distribution
which equal or exceed in absolute value the observed $|b_k|$ to the total number of
combinations, $\binom{5+4}{5}$.
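
As a quick check on the size of the reference distribution in this example (a worked count only; the lower bound on the p-value follows from the definition above, since the observed assignment is itself always counted):

```latex
\[
\binom{5+4}{5} = \frac{9!}{5!\,4!} = 126 ,
\qquad
p \;\ge\; \frac{1}{126} \approx 0.008 .
\]
```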

Significance testing for the hypothesis $H_0: \beta_k - \beta_l = 0$ is achieved similarly. Adjust
penetration values for the linear effect of velocity and compute residuals in the same manner, still
computing the residuals relative to $d_t(v)$. The difference between materials is estimated by
$b_k - b_l$ and computed by subtracting the average of the residuals resulting from material l
from that of material k. The reference distribution arises from computing $b_k - b_l$ under all
possible assignments of residuals between materials k and l.

³ Specifically, the null hypothesis for the randomization test is that penetration depth measurements are stochastically independent of the
penetrator having been formed from material k or material l [Edgington 1987].

⁴ This rationale presupposes random allocation of subjects to treatments. However, as pointed out by Edgington [1987], random
allocation principally guards against undue influence resulting from between or within subject variability. Such variability in the context of
semi-infinite steel blocks is considered negligible relative to the material differences under study.

Before turning to examples, some more detail is required as to how these residuals,
relative to $d_t(v)$, are computed. From Equation 5, the model $d_t(v)$ can be expressed

$$d_t(v) = \beta_0 + \beta_t (v - \bar{v}_{\cdot\cdot}). \qquad (6)$$

(The indices have been suppressed to emphasize that this is a model for penetration depth.)
Both $\beta_0$ and $\beta_t$ must be estimated. Begin with slope. Assuming parallel
penetration-against-velocity models, d(v), the common slope is taken as the average
within-materials regression slope, $b_t$, which can be delivered by any regression subroutine fitting the
regression expressed as Equation 5 in its complete form. The estimate is computed as

$$b_t = \frac{\sum_j \sum_i (v_{ij} - \bar{v}_{\cdot j})(d_{ij} - \bar{d}_{\cdot j})}{\sum_j \sum_i (v_{ij} - \bar{v}_{\cdot j})^2}. \qquad (7)$$

The ordinate at $v = \bar{v}_{\cdot\cdot}$, $\beta_0$, is taken as an adjusted mean penetration depth for the tth material,
appearing as Equation 8.

$$\bar{d}_{\cdot t(\mathrm{adj})} = \bar{d}_{\cdot t} - b_t (\bar{v}_{\cdot t} - \bar{v}_{\cdot\cdot}). \qquad (8)$$

This too will be delivered by a regression of Equation 5 when zeros are used as the values for
the indicator variables in the data rows corresponding to the tth material. Using estimates
for $\beta_0$ and $\beta_t$, the estimated model for the tth material takes the form

$$\hat{d}_{ij(t)} = \bar{d}_{\cdot t(\mathrm{adj})} + b_t (v_{ij} - \bar{v}_{\cdot\cdot}). \qquad (9)$$

Equation 9 is merely the least-squares fit for the tth material, taking into consideration the
common slope. The residuals for the jth material relative to the tth material appear as

$$r_{ij(t)} = d_{ij} - \hat{d}_{ij(t)}. \qquad (10)$$

The residuals $r_{ij(t)}$ are then manipulated in the manner described above.
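
The computations in Equations 7 through 10 can be sketched in a few lines of plain Python. The sketch below is illustrative rather than the authors' program: the function names are hypothetical, and the data are the depth and velocity pairs of Table 1 (Example 1), with 93% tungsten taken as the reference material t. Small differences from the tabulated residuals (on the order of 0.01) are to be expected from rounding.

```python
# Table 1 data: material -> list of (depth, velocity) pairs.
data = {
    "97% tungsten": [(42.70, 1098), (66.80, 1304), (78.20, 1489), (89.70, 1507)],
    "DU Rc=49": [(58.42, 1067), (85.34, 1314), (101.09, 1481), (115.06, 1654)],
    "DU Rc=45": [(78.99, 1304), (99.06, 1482), (116.33, 1660)],
    "93% tungsten": [(39.12, 1086), (65.02, 1297), (83.31, 1500), (105.92, 1682)],
}


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def common_slope(data):
    """Equation 7: pooled within-material regression slope."""
    sxy = sxx = 0.0
    for shots in data.values():
        d_bar = mean(d for d, _ in shots)
        v_bar = mean(v for _, v in shots)
        sxy += sum((v - v_bar) * (d - d_bar) for d, v in shots)
        sxx += sum((v - v_bar) ** 2 for _, v in shots)
    return sxy / sxx


def residuals_relative_to(data, reference):
    """Equations 8-10: residuals of every shot about the reference line."""
    b_t = common_slope(data)
    v_all = mean(v for shots in data.values() for _, v in shots)
    ref = data[reference]
    # Equation 8: adjusted mean depth of the reference material.
    d_adj = mean(d for d, _ in ref) - b_t * (mean(v for _, v in ref) - v_all)
    # Equations 9 and 10: fitted reference line, then residuals about it.
    resid = {
        name: [d - (d_adj + b_t * (v - v_all)) for d, v in shots]
        for name, shots in data.items()
    }
    return b_t, d_adj, v_all, resid


b_t, d_adj, v_all, resid = residuals_relative_to(data, "93% tungsten")
print(f"slope={b_t:.4f} intercept={d_adj:.4f} mean velocity={v_all:.0f}")
for name, r in resid.items():
    print(name, [round(x, 2) for x in r], "mean:", round(mean(r), 4))
```

Because the reference material's residuals average to zero by construction, the mean residual of each other material is its estimated effect $b_k$.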

5.  Examples 

In this section two examples are discussed. The purpose of the first is to provide a
synopsis of how the randomization test is performed. In that example, data are
[...]. The purpose of the second is to illustrate performance when data
[...] when the data collection does not exactly follow the template
discussed earlier. Data for both examples were extracted from an unpublished manuscript
provided by Mr. Timothy Farrand of the Ballistic Research Laboratory.

5.1  Example  1 

Figure 4 displays data resulting from the firing of four penetrator (material) types against
semi-infinite steel blocks. The penetrators were manufactured with a common mass of 65 g
and with the length-over-diameter ratio (L/D) equal to 15. The depleted uranium (DU)
penetrators are separated according to Rockwell hardness (Rc). It is apparent that the
template for data collection given in Figure 2 was approximately followed, save duplicate
97%-tungsten results at 1500 m/s and no result for DU Rc=45 at 1100 m/s. Four data points are
the most recorded for any material. Data are listed in Table 1.

Several steps must be accomplished on the way to significance testing. The first step is the
[...]. For this example, material t is 93% tungsten. Estimates for the
parameters $\beta_0$ and $\beta_t$ result from regressing penetration depth on velocity and the three indicator
variables found in Equation 5. The values for the indicator variables $m_{ij1}$, $m_{ij2}$, and $m_{ij3}$ are
shown in Table 1. It follows that $\beta_1$, $\beta_2$, and $\beta_3$ represent differences between 93% tungsten (our
control) and 97% tungsten, DU Rc=49, and DU Rc=45, respectively. The estimated
penetration depths for material t are given by

$$\hat{d}_{ij(t)} = 73.7310 + 0.1035 (v_{ij} - 1395)$$

and are plotted as $\hat{d}_{ij(t)}$ in Figure 5. Next, the residuals relative to the tth group, $r_{ij(t)}$, are
computed as $d_{ij} - \hat{d}_{ij(t)}$. In Figure 6 these residuals are plotted about the horizontal line, $r_{ij(t)} = 0$.
Table 2 lists the residual values for each material.

Table 1. Data Matrix for L/D = 15

Material        d_ij (mm)   v_ij (m/s)   m_ij1   m_ij2   m_ij3
97% tungsten       42.70       1098        1       0       0
                   66.80       1304        1       0       0
                   78.20       1489        1       0       0
                   89.70       1507        1       0       0
DU Rc=49           58.42       1067        0       1       0
                   85.34       1314        0       1       0
                  101.09       1481        0       1       0
                  115.06       1654        0       1       0
DU Rc=45           78.99       1304        0       0       1
                   99.06       1482        0       0       1
                  116.33       1660        0       0       1
93% tungsten       39.12       1086        0       0       0
                   65.02       1297        0       0       0
                   83.31       1500        0       0       0
                  105.92       1682        0       0       0

Table 2. Residuals Relative to the tth Material for L/D = 15 (entries are r_ij(t))

97% tungsten   DU Rc=49   DU Rc=45   93% tungsten
    -0.29        18.65      14.68       -2.63
     2.49        20.00      16.32        1.43
    -5.26        18.46      15.16       -1.29
     4.38        14.52                   2.49

To determine significance, the $r_{ij(t)}$ are permuted between the materials being
compared. Consider, for example, the two DU materials. Their difference is estimated by $b_2 - b_3$
and takes on the value 2.514, the average of the residuals of DU Rc=49 less the average of
the residuals of DU Rc=45. The reference distribution for determining significance is
constructed by computing $b_2 - b_3$ for each possible combination of the residuals. Figure 7 depicts
one such combination where four residuals were reassigned. In that instance $b_2 - b_3 = 1.535$.
Figure 8 displays the reference distribution in the form of a stem-plot. The observed value
for $b_2 - b_3$ is circled. There are six distribution values which equal or exceed in absolute value
$|b_2 - b_3|$ (denoted by bold type in Figure 8), hence a p-value of 6/35 or 0.171. Table 3
includes the results of each pairwise material comparison. In consideration of the data, all
p-values appear reasonable and act to quantify the differences observed.
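
The enumeration behind the 6/35 figure can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `randomization_test` and the tolerance `tol` are assumptions introduced here, and the inputs are the two-decimal residuals from Table 2, so the observed statistic differs slightly from the 2.514 obtained from unrounded residuals.

```python
from itertools import combinations


def randomization_test(res_k, res_l, tol=1e-9):
    """Two-sided randomization test on the difference in mean residuals."""
    pooled = list(res_k) + list(res_l)
    n, n_k = len(pooled), len(res_k)

    def statistic(idx_k):
        # Mean of the residuals assigned to material k less the mean of the rest.
        chosen = set(idx_k)
        group_k = [pooled[i] for i in range(n) if i in chosen]
        group_l = [pooled[i] for i in range(n) if i not in chosen]
        return sum(group_k) / len(group_k) - sum(group_l) / len(group_l)

    observed = statistic(range(n_k))
    # Every assignment that preserves the two group sizes.
    reference = [statistic(idx) for idx in combinations(range(n), n_k)]
    # Count values at least as extreme as the observed one (tol guards ties).
    unusual = sum(1 for s in reference if abs(s) >= abs(observed) - tol)
    return observed, unusual, len(reference)


# Residuals copied from Table 2 (rounded to two decimals).
du_rc49 = [18.65, 20.00, 18.46, 14.52]
du_rc45 = [14.68, 16.32, 15.16]

observed, unusual, total = randomization_test(du_rc49, du_rc45)
print(f"b2 - b3 = {observed:.3f}, p = {unusual}/{total} = {unusual / total:.3f}")
```

Full enumeration is practical here because the reference distributions are small (35 or 70 values); for larger groups, sampling random assignments is a common substitute, which the source does not discuss.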

5.2  Example  2 

A  second  data  set  is  displayed  in  Figure  9.  Three  65-g  penetrators  were  tested,  each 
with  L/D  =  10.  Unlike  in  the  previous  example,  data  were  not  collected  strictly  according  to 
the  template  in  Figure  2.  They  need  not  be  for  the  randomization  test  to  be  valid.  Also,  the 
distinctions between groups do not appear as great as in Example 5.1. It is in this situation
that  an  explicit  quantification  of  any  differences  is  most  needed  because  it  becomes  even  less 
clear  how  much  observed  difference  is  real  and  how  much  is  attributable  to  chance  variations. 

Table 4 lists the results for all pairwise comparisons between materials. The increased
sample sizes over the previous example allow for a finer resolution in the number of
reference distribution values. There are 12,870 values comprising the reference distribution for $b_1$,
the estimated difference between 97% tungsten and 93% tungsten. The p-value for the
randomization test is 0.192, meaning that under the null hypothesis the probability is 0.192 of
observing a value for $b_1$ at least as unusual as 1.4050. Generally, such a p-value would not be
considered significant, suggesting that 97% tungsten and 93% tungsten are performing similarly
for L/D = 10 penetrators.

A second contrast, $\beta_1 - \beta_2$, signifying the difference between 97% tungsten and DU, is
estimated to be -4.5113.⁵ It is not clear from the examination of Figure 9 that this constitutes
a real difference in performance. The randomization test, however, yields a p-value of 0.0040
and provides solid justification for the metallurgist's claim that 97% tungsten and DU
materials are performing differently. Such a difference was observed by Magness [1990].
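
The p-values reported in Table 4 can be checked directly from the tabulated counts. The binomial identities in the last line are an observation made here, not a statement from the source: they are consistent with group sizes of 8 and 8 for the first comparison and 8 and 7 for the other two, but the source does not state the sample sizes.

```latex
\begin{align*}
p(\beta_1) &= 2472/12870 \approx 0.1921, \\
p(\beta_2) &= 3/6435 \approx 0.0005, \\
p(\beta_1 - \beta_2) &= 26/6435 \approx 0.0040, \\
12870 &= \binom{16}{8}, \qquad 6435 = \binom{15}{7}.
\end{align*}
```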

⁵ No discussion in this report is devoted to controlling the error rate for multiple contrasts. For more explanation, see Kirk [1982].

Figure 6. Residuals Relative to the tth Material.


Figure  7.  One  Possible  Reallocation  of  Residuals 
Between  Materials  2  and  3. 

Figure 8. Stem-plot Representation of the Distribution
of the Test Statistic, b2-b3.


Table 3. Significance of the Differences Observed in Example 5.1

L/D = 15                             Randomization
Contrast    Estimate    # unusual    # permutations    p-value
β1            0.3298       60             70            0.857
β2           17.9032        2             70            0.029
β3           15.3890        1             35            0.029
β1-β2       -17.5734        2             70            0.029
β2-β3         2.5142        6             35            0.171
β1-β3       -15.0592        1             35            0.029

Figure 9. Depth of Penetration for Three Materials, L/D = 10.


Table 4. Significance of the Differences Observed in Example 5.2

L/D = 10                             Randomization
Contrast    Estimate    # unusual    # permutations    p-value
β1            1.4050      2472          12870           0.1921
β2            5.9163         3           6435           0.0005
β1-β2        -4.5113        26           6435           0.0040

6.  Conclusion 

For  the  testing  of  1/4-scale  kinetic  energy  penetrators  against  semi-infinite  steel  blocks, 
the  technical  considerations  and  the  procedures  addressing  them  are  long  established.  It  is 
the  intent  of  this  effort  to  enhance  the  inferential  process  within  the  presiding  experimental 
structure.  Presently,  once  data  are  collected,  inferences  principally  consist  of  an  engineering 
judgment  as  to  the  meaning  of  an  observed  vertical  gap  between  linear  functions  representing 
the penetration performance of two materials. The initial motivation for pursuing this
problem was the engineer's lament that, occasionally, when his judgment was questioned, he had
little  recourse  but  to  stand  firm  on  his  opinion  forged  from  years  of  experience.  The  linear 
functions  themselves  are  usually  established  subjectively  and  are  considered  parallel  over  the 
range  of  1100  m/s  to  1700  m/s.  Such  subjectivity  does  bring  into  question  the  consistency  of 
the  assessment  process.  An  objective  method  for  fit,  such  as  least  squares,  is  seldom  used, 
and  then  not  in  such  a  way  as  to  incorporate  the  common  slopes  assumption.  Nor  need  it  be 
in  all  instances.  Often,  the  differences  are  so  great  as  to  allow  for  the  approximate  fitting  of 
the  linear  functions  with  no  loss  to  the  outcome,  but  perhaps  equally  as  often  they  are  not 
great,  occurring  when  only  marginal  improvements  are  made  over  an  historical  (control) 
material. 

In  summary,  the  report  identifies  the  experimental  situation  as  being  similar  to  that  in 
which  an  analysis  of  covariance  model  is  usually  employed  and  then  expresses  the  linear 
model  in  a  manner  conforming  to  how  practitioners  currently  view  the  problem,  even  to  the 
extent  of  automatically  incorporating  the  parallel  lines  assumption.  The  report  then  explores 
some  important  problems,  such  as  data  arising  from  independent  studies,  in  implementation 
of  the  classical  method  for  significance  testing  and  recommends  an  alternative  to  surmount 
these  problems  in  the  form  of  a  randomization  test.  This  test  is  implemented  on  two  sets  of 
real  data,  and  its  application  in  the  context  of  those  data  is  demonstrated. 
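
As a rough illustration of how a randomization p-value of the kind reported in Tables 3 and 4 is formed (the number of "unusual" reallocations divided by the number of reallocations considered), the following Python sketch enumerates every split of two small groups. It is a generic two-sample version, not the residual-reallocation procedure of this report; the function name `randomization_p_value`, the difference-of-means statistic, and the data are assumptions made for illustration only.

```python
from itertools import combinations


def randomization_p_value(group_a, group_b):
    """Exact two-sided randomization test on the difference of group means.

    Returns (number of unusual splits, number of splits, p-value), where a
    split is 'unusual' if its |mean difference| is at least the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    unusual = 0
    splits = 0
    # Every way of relabelling n_a of the pooled values as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[t] for t in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        if diff >= observed - 1e-12:  # small tolerance for rounding
            unusual += 1
    return unusual, splits, unusual / splits


# Hypothetical data: two completely separated groups of four observations.
unusual, splits, p = randomization_p_value([1, 2, 3, 4], [5, 6, 7, 8])
print(unusual, splits, round(p, 3))  # prints: 2 70 0.029
```

With four observations per group there are 70 possible splits, and only the observed split and its mirror image are as extreme as the data, which gives the smallest two-sided p-value attainable with samples of this size.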

The  approach  presented  is  an  attempt  at  a  unifying  structure  within  which  inferences  in 
this  environment  can  be  made  both  quantifiable  and  consistent.  The  recommended  procedure 
combines  existing  techniques  such  as  least  squares  with  a  new  application  of  a  randomization 
test in determining the significance of observed material differences. With this test in support,
practitioners can make definitive statements as to the statistical significance of material
differences observed.
Abstract

A method was proposed by Hanson and Koopmans (Annals of
Math. Stat., 35, 1964) for obtaining conservative one-sided tolerance
limits for large classes of distribution functions. Let the cdf of the
underlying population be denoted F. The Hanson-Koopmans result
provides upper tolerance limits if -log(1 - F) is convex (IFR distributions)
and lower tolerance limits if -log(F) is convex. Any two order
statistics can be used to obtain these limits. The method is particularly
useful for small samples for which the nonparametric tolerance
limits do not exist.

The Hanson-Koopmans procedure was originally implemented for
consecutive order statistics. This was apparently done for computational
simplicity in determining the weights. Unfortunately, this choice
results in extremely conservative limits. In this paper, we evaluate the
performance of the Hanson-Koopmans method for various pairs of order
statistics and suggest combinations for which the conservatism is
greatly reduced. In addition, we suggest a substantial further reduction
in conservatism for distributions with positive support.

An important application of this method in the aircraft industry
is the determination of 95% lower confidence limits on the first and tenth
population percentiles of material strength, and this application has
been the motivation for the present study.


1  Introduction 

In structural design, an allowable stress, working stress, or design allowable
for a material is the maximum stress at which one can be reasonably
certain that failure will not occur. For the design of structures
for which weight is not a primary consideration, allowables are typically
calculated by dividing a stress level at which failure is known to often
occur by a sufficiently large constant called a safety factor (e.g., Gere
and Timoshenko, 1984, p. 29). The structure is then designed so as to
ensure that the stresses do not exceed the allowables for the materials.

This  approach  is  too  conservative  for  many  aircraft  applications, 
however.  Since  weight  is  an  important  consideration  in  aircraft  design, 
this  industry  long  ago  established  two  one-sided  tolerance  limits  to 
supplement  the  use  of  safety  factors  in  determining  allowables.  These 
tolerance  limits  are  a  95%  lower  confidence  limit  on  the  tenth  percentile 
and  a  95%  lower  confidence  limit  on  the  first  percentile  of  the  strength 
distribution of a material. These are referred to as ‘B-basis’ and ‘A-basis’
values, respectively (Mil Handbook 5E, 1987; Mil Handbook 17C,
1992).

In this article, we discuss methods for determining one-sided tolerance
limits nonparametrically. The motivation for this study has been
Mil Handbook 17, and therefore the lower tolerance limits corresponding
to ‘A-basis’ and ‘B-basis’ values will be used when it is desirable to
fix ideas. However, the methods to be discussed are applicable to any
one-sided tolerance limits, and to all sample sizes greater than one.

2  Preliminaries 

Let $F$ be the absolutely continuous distribution function of a continuous
random variable $X$, and let $x_\beta$ be the $\beta$th quantile of $F$; that is,
$F(x_\beta) = \beta$. Assume that we have an iid random sample $\{X_i\}_{i=1}^{n}$ from
$F$, and let the order statistics of this sample be

$$X_{(1)} < X_{(2)} < \cdots < X_{(n)}. \tag{1}$$

We will also need to make use of the order statistics of standard uniform
and standard exponential random samples of size $n$, and we define these
random variables as

$$U_{(1)} < U_{(2)} < \cdots < U_{(n)} \tag{2}$$

and

$$E_{(1)} < E_{(2)} < \cdots < E_{(n)}, \tag{3}$$

respectively. We will adopt the usual convention of denoting a realization
of a random variable by the corresponding lower case letter.

A $(\beta, \gamma)$ lower tolerance limit is a random variable $T_l = T_l(X_1, \ldots, X_n)$
such that

$$P(T_l \le x_{1-\beta}) \ge \gamma. \tag{4}$$

A $(\beta, \gamma)$ upper tolerance limit is a random variable $T_u = T_u(X_1, \ldots, X_n)$
such that

$$P(T_u \ge x_\beta) \ge \gamma. \tag{5}$$

It is easy to see that upper (lower) tolerance limits are precisely upper
(lower) confidence limits on $x_\beta$ or $x_{1-\beta}$, respectively.

3 Nonparametric Limits Based on One Order Statistic

Let $\{X_i\}_{i=1}^{n}$ be independent, identically distributed random variables
with continuous cdf $F(x)$. We will demonstrate how a sample order
statistic can sometimes be used as a lower tolerance limit. A similar
argument can be made for upper tolerance limits. The material in this
section is well known; a recent reference is Conover (1980, pp. 118-121).

The probability that the $i$th order statistic, $X_{(i)}$, is less than the
$(1-\beta)$th quantile of $F$, $x_{1-\beta}$, is easily seen to be

$$P(X_{(i)} \le x_{1-\beta}) = P(F(X_{(i)}) \le F(x_{1-\beta})) = P(U_{(i)} \le 1-\beta), \tag{6}$$

where $U_{(i)}$ is the corresponding order statistic from a uniform sample
on $[0, 1]$ of size $n$. Since

$$U_{(i)} \sim \mathrm{Beta}(i,\, n-i+1) \tag{7}$$

(e.g., Hogg and Craig, 1978, p. 159), we have that

$$P(X_{(i)} \le x_{1-\beta}) = \mathrm{Beta}(1-\beta;\, i,\, n-i+1), \tag{8}$$

where $\mathrm{Beta}(t;\, \nu_1, \nu_2)$ is the Beta cumulative distribution function with
parameters $\nu_1$ and $\nu_2$. Let $i_0$ denote the largest rank for which

$$P(U_{(i)} \le 1-\beta) \ge \gamma. \tag{9}$$

Then $X_{(i_0)}$ is a lower $(\beta, \gamma)$ tolerance limit. Note that, for any $\beta$ and
$n$, if

$$P(U_{(1)} \le 1-\beta) < \gamma_*, \tag{10}$$

then a $(\beta, \gamma)$ lower tolerance limit based on a single order statistic does
not exist for $\gamma = \gamma_*$.

3.1  Limitations  of  Single  Order  Statistic  Tolerance  Limits 

This highlights the first, and most serious, limitation of nonparametric
single order statistic limits. For a given tolerance limit $(\beta, \gamma)$ and
sample size $n$, these limits need not exist.

A second limitation is a consequence of the fact that, for given $\beta$
and $n$, there are only $n$ values of $\gamma$ for which tolerance limits with exact
confidence are available, namely $\{\gamma_i\}_{i=1}^{n}$, where, for lower limits,

$$\gamma_i = \gamma_i(\beta, n) = \mathrm{Beta}(1-\beta;\, i,\, n-i+1). \tag{11}$$

Usually, the desired $\gamma$ is not one of these $n$ values, and one uses instead
the largest $i$ for which $\gamma_i \ge \gamma$ (or, for upper limits, the smallest $i$ for
which $\gamma_i \ge \gamma$). This can result in tolerance limits that are extremely
conservative, as the following example illustrates.

Consider the problem of determining a lower tolerance limit when
$n = 44$, $\beta = .9$ and $\gamma = .95$. Since

$$P(U_{(2)} \le .1) = \mathrm{Beta}(.1;\, 2,\, 43) = .943 < .95 \tag{12}$$

and

$$P(U_{(1)} \le .1) = \mathrm{Beta}(.1;\, 1,\, 44) = .990 > .95, \tag{13}$$

$X_{(1)}$ provides the desired tolerance limit. However, the actual confidence
of this tolerance limit is .99, substantially greater than $\gamma = .95$.
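
The rank selection just described can be sketched in a few lines of Python. The sketch relies on the standard identity between the Beta cumulative and a binomial tail, Beta(q; i, n-i+1) = P(Binomial(n, q) >= i); the function names `order_stat_confidence` and `largest_rank` are illustrative choices and not part of the original work.

```python
from math import comb


def order_stat_confidence(i, n, beta):
    """P(U_(i) <= 1 - beta): confidence attained by X_(i) as a lower limit.

    Uses Beta(q; i, n-i+1) = P(Binomial(n, q) >= i), with q = 1 - beta.
    """
    q = 1.0 - beta
    return sum(comb(n, m) * q**m * (1.0 - q) ** (n - m) for m in range(i, n + 1))


def largest_rank(n, beta, gamma):
    """Largest rank i with confidence >= gamma, or 0 if no such rank exists."""
    best = 0
    for i in range(1, n + 1):
        if order_stat_confidence(i, n, beta) >= gamma:
            best = i
        else:
            break  # the confidence decreases as i increases
    return best


print(largest_rank(44, 0.90, 0.95))            # rank used in the example
print(order_stat_confidence(1, 44, 0.90))      # its actual confidence
print(order_stat_confidence(2, 44, 0.90))      # confidence of the next rank
```

A returned rank of 0 corresponds to the situation of (10), in which no single order statistic attains the requested confidence.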

A natural way to circumvent both of these limitations is to consider
tolerance limits which interpolate between two order statistics, or, if
necessary, extrapolate beyond $X_{(1)}$ or $X_{(n)}$. It is necessary then to
calculate probabilities such as $F(cX_{(i)} + (1-c)X_{(j)})$, where $c > 0$. We
need to make additional assumptions on $F$ in order to relate
$F(cX_{(i)} + (1-c)X_{(j)})$ to $F(X_{(i)}) = U_{(i)}$ and $F(X_{(j)}) = U_{(j)}$. In particular, if we
assume that $-\log(F)$ is convex, then we can determine lower limits
based on two order statistics, and if we assume that $-\log(1-F)$ is
convex, then we can find upper limits. This is the approach taken by
Hanson and Koopmans (1964).
4 The Hanson-Koopmans Theorem

Hanson and Koopmans (1964) showed that, given certain pairs of order
statistics from an iid sample of any size, it is possible to find a linear
combination of these two random variables which is (either an upper
or a lower) tolerance limit for any quantile $\beta$ and any confidence $\gamma$.

The lower limits require the assumption that $-\log(F)$ be a convex
function, and the upper limits require the convexity of $-\log(1-F)$.
Both of these classes are large enough for one to legitimately regard
these tolerance limits as nonparametric. The condition that $-\log(1-F)$
is convex is equivalent to an increasing hazard rate assumption.
Hanson and Koopmans show that the intersection of the class of distributions
for which $-\log(F)$ is convex with the class of distributions
for which $-\log(1-F)$ is convex includes the Polya Type II distributions,
a family containing many of the distributions that are often used in
practice.

We  will  derive  the  Hanson-Koopmans  lower  limits  first,  and  then 
we  will  use  a  very  similar  argument  to  obtain  upper  limits. 


4.1  Lower  Tolerance  Limits 


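Assume $n \ge 2$ and let $1 \le i < j \le n$. We will consider lower tolerance
limits of the form

$$T_l = X_{(j)} + k_l (X_{(i)} - X_{(j)}), \tag{14}$$

where $k_l \ge 1$. We make the assumption that $-\log(F)$ is convex. Note
that, if $k_l > 1$, then $T_l$ is an extrapolation below $X_{(i)}$, so Jensen's
inequality does not hold in its usual direction; it is reversed, and we have, for
any quantile $x_{1-\beta}$,

$$
\begin{aligned}
P(T_l \le x_{1-\beta}) &= P[F(T_l) \le F(x_{1-\beta})] \\
&= P[F(T_l) \le 1-\beta] \\
&= P\{-\log[F(T_l)] \ge -\log(1-\beta)\} \\
&\ge P\{-\log[F(X_{(j)})] + k_l \log[F(X_{(j)})] - k_l \log[F(X_{(i)})] \ge -\log(1-\beta)\} \\
&= P[\log(U_{(j)}) - k_l \log(U_{(j)}) + k_l \log(U_{(i)}) \le \log(1-\beta)] \\
&= P\left[ U_{(j)} \left( \frac{U_{(i)}}{U_{(j)}} \right)^{k_l} \le 1-\beta \right].
\end{aligned} \tag{15}
$$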
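
The inequality step in (15) can be spelled out as follows. This is a sketch that uses only the convexity assumption stated above, with the shorthand $g = -\log F$ introduced here for convenience. Since $k_l \ge 1$, the order statistic $X_{(i)}$ is a convex combination of $T_l$ and $X_{(j)}$, so convexity of $g$ applies to it; solving the resulting inequality for $g(T_l)$ gives the bound used in (15).

```latex
X_{(i)} = \frac{1}{k_l}\,T_l + \Bigl(1 - \frac{1}{k_l}\Bigr) X_{(j)}
\quad\Longrightarrow\quad
g(X_{(i)}) \le \frac{1}{k_l}\,g(T_l) + \Bigl(1 - \frac{1}{k_l}\Bigr) g(X_{(j)})

\Longrightarrow\quad
-\log[F(T_l)] \;\ge\; -\log[F(X_{(j)})] + k_l \log[F(X_{(j)})] - k_l \log[F(X_{(i)})].
```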
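The joint density of $U_{(i)}$ and $U_{(j)}$ is (Hogg and Craig, 1978, p. 160)

$$f_{U_{(i)},U_{(j)}}(x, y) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\; x^{i-1} (y-x)^{j-i-1} (1-y)^{n-j}, \quad 0 < x < y < 1. \tag{16}$$

For fixed $k_l \ge 1$, we can integrate this density over the region

$$\mathcal{D}(k_l) = \{(x, y) \mid x < y \text{ and } y\,(x/y)^{k_l} \le 1-\beta\}, \tag{17}$$

and thereby express (15) as the following monotone function of $k_l$:

$$H(k_l) = \iint_{\mathcal{D}(k_l)} f_{U_{(i)},U_{(j)}}(x, y)\, dx\, dy. \tag{18}$$

As $k_l \downarrow 1$, $T_l \to X_{(i)}$ and $H(k_l) \to \mathrm{Beta}(1-\beta;\, i,\, n-i+1) = \gamma_i(\beta, n)$.
As $k_l \to \infty$, $y\,(x/y)^{k_l} \to 0$ for all $x < y \in (0, 1]$, so $H(k_l) \to 1$. So, for
any $i$, $j$, $\beta$, and $n$, and for any $\gamma \ge \gamma_i(\beta, n)$, there exists a $k_l^*$ such that
$H(k_l^*) = \gamma$, and hence

$$T_l = X_{(j)} + k_l^* (X_{(i)} - X_{(j)}) \tag{19}$$

is a lower $(\beta, \gamma)$ tolerance limit.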
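
Because the bound in (15) depends on $F$ only through the uniform order statistics, the function $H$ can be checked by simulating uniform samples directly. The Python sketch below does this by brute force; the function name `h_monte_carlo`, the replication count, and the seed are arbitrary illustrative choices, and the result is only a Monte Carlo estimate.

```python
import random


def h_monte_carlo(p, n, i, j, k, reps=200_000, seed=12345):
    """Monte Carlo estimate of H(k) = P[U_(j) * (U_(i)/U_(j))**k <= p].

    Here p = 1 - beta, and the ranks i < j are 1-based.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # 1 - random() lies in (0, 1], which avoids a zero denominator.
        u = sorted(1.0 - rng.random() for _ in range(n))
        ui, uj = u[i - 1], u[j - 1]
        if uj * (ui / uj) ** k <= p:
            hits += 1
    return hits / reps


# n = 2, i = 1, j = 2 with the Appendix A factor k = 35.177 for (.90, .95):
print(h_monte_carlo(0.10, 2, 1, 2, 35.177))
```

The printed estimate should lie close to the nominal confidence .95, up to simulation error of roughly .001 at this replication count.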

4.2  Upper  Tolerance  Limits 

We  now  sketch  the  corresponding  argument  for  upper  tolerance  limits. 

The upper limit, based on $X_{(i)}$ and $X_{(j)}$ for $i < j$, is of the form

$$T_u = X_{(i)} + k_u (X_{(j)} - X_{(i)}). \tag{20}$$

We assume that $-\log(1-F)$ is a convex function, and we would like
to determine $k_u$ so that, for given $\beta$ and $\gamma$, $T_u$ provides a $(\beta, \gamma)$ upper
tolerance limit. We see that

$$
\begin{aligned}
P(T_u \ge x_\beta) &= P[F(T_u) \ge F(x_\beta)] \\
&= P[1 - F(T_u) \le 1-\beta] \\
&= P\{-\log[1 - F(T_u)] \ge -\log(1-\beta)\} \\
&\ge P\{-\log[1 - F(X_{(i)})] + k_u \log[1 - F(X_{(i)})] - k_u \log[1 - F(X_{(j)})] \ge -\log(1-\beta)\} \\
&= P[-\log(U_{(n-i+1)}) - k_u \log(U_{(n-j+1)}) + k_u \log(U_{(n-i+1)}) \ge -\log(1-\beta)] \\
&= P\left[ U_{(n-i+1)} \left( \frac{U_{(n-j+1)}}{U_{(n-i+1)}} \right)^{k_u} \le 1-\beta \right].
\end{aligned} \tag{21}
$$

The probability (21) is the following monotone function of $k_u$:

$$V(k_u) = \iint_{\mathcal{D}(k_u)} f_{U_{(n-j+1)},U_{(n-i+1)}}(x, y)\, dx\, dy, \tag{22}$$

where $f_{U_{(l)},U_{(m)}}(x, y)$, the joint density of two uniform order statistics,
is given by (16), and $\mathcal{D}(k)$ is defined in (17).


5  Which  Order  Statistics  are  Best? 


The  discussion  of  the  previous  section  follows  from  results  in  Hanson 
and  Koopmans  (1964).  However,  the  Hanson-Koopmans  results  are 
not  widely  known,  and  so  it  has  been  necessary  to  derive  them  again 
in  this  article. 

It seems likely that the very useful work of Hanson
and Koopmans has been virtually ignored by applied statisticians
because of their unfortunate choice of the ranks $i$ and $j$. Probably because
the numerical double integrals required in order to evaluate (18)
represented substantial computation for the time, Hanson and Koopmans
decided to consider only consecutive order statistics; that is, they
let $j = i + 1$. We then have that

$$f_{U_{(i)},U_{(i+1)}}(x, y) = \frac{n!}{(i-1)!\,(n-i-1)!}\; x^{i-1} (1-y)^{n-i-1}, \tag{23}$$

and the necessary calculations can be done with the aid of tables of
the gamma and incomplete beta functions, and without the need for
numerical integration. The tolerance limits which result from the use
of consecutive order statistics, as we shall see, are usually conservative
to the point of being useless for most applications.

Woodward and Frawley (1980) is apparently the only article in the
applied literature which builds on the work of Hanson and Koopmans.
These authors wanted to use the Hanson-Koopmans limits for data
which had ties, so that the required consecutive order statistics were
not always available. They noted that considering the range, that is,
letting $i = 1$ and $j = n$, also leads to a single integral in (18). They
computed tolerance limit factors for the range, and provided tables. It
turns out that the Woodward and Frawley limits provide a substantial
improvement over the use of consecutive order statistics, although these
authors did not comment on this point.

In fact, there is reason to believe that, if one uses closeness of the expected
value of the tolerance limit to the relevant quantile as a criterion,
then consecutive order statistics constitute the worst possible choice of
ranks, at least for the important cases of (.90, .95) and (.99, .95) lower
tolerance limits. A more careful selection of which $i$ and $j$ to use for
given $\beta$, $\gamma$, and $n$ leads to a dramatic improvement over consecutive
order statistics, as we will show in Section 7.


6 Exponential Spacings and the Calculation of
$H(k_l)$ and $V(k_u)$


The computational burden due to the integral (18) is completely unnecessary,
since the functions $H$ and $V$ can be obtained in closed form.
The reason for this is that, if $X_i \sim F$ for any continuous $F$, then
it is well known, and easy to show, that $-\log[F(X_i)]$ has the standard
exponential distribution. Consequently, $-\log(U_{(l)})$ has the same distribution
as $E_{(n-l+1)}$ and $-\log(1 - U_{(l)})$ has the same distribution as
$E_{(l)}$, for any $l$, where $1 \le l \le n$, and the $E_{(l)}$ are order statistics from
a standard exponential sample of size $n$.

Define the spacings in a standard exponential sample as follows:

$$D_l = E_{(l)} - E_{(l-1)}, \tag{24}$$

where $E_{(0)} = 0$. Then, by Theorem 2.5 of Barlow and Proschan (1981,
p. 59), we have that:

1. The $D_l$ are mutually independent, and

2. $P(D_l \le t) = 1 - e^{-(n-l+1)t}$.

Let $\{E_s\}_{s=1}^{n}$ be an iid sample from a standard exponential distribution.
For $1 \le l \le n$, we can write $E_{(l)}$ as a sum of the spacings, and
therefore, in distribution, as a linear combination of $\{E_s\}_{s=1}^{l}$:

$$E_{(l)} = \sum_{s=1}^{l} D_s \stackrel{d}{=} \sum_{s=1}^{l} \frac{E_s}{n-s+1}. \tag{25}$$
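
One quick consequence of (25) is that the mean of the $l$th exponential order statistic is the sum of $1/(n-s+1)$ over $s = 1, \ldots, l$. The following Python sketch compares that sum with a simulated average as a sanity check on the representation; the function names, replication count, and seed are illustrative assumptions.

```python
import random


def exp_order_stat_mean(n, l):
    """E[E_(l)] implied by (25): sum over s = 1..l of 1/(n - s + 1)."""
    return sum(1.0 / (n - s + 1) for s in range(1, l + 1))


def simulated_order_stat_mean(n, l, reps=50_000, seed=2024):
    """Average of the l-th smallest of n standard exponentials, by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = sorted(rng.expovariate(1.0) for _ in range(n))
        total += sample[l - 1]
    return total / reps


# Third smallest of five: the two numbers should agree to about two decimals.
print(exp_order_stat_mean(5, 3), simulated_order_stat_mean(5, 3))
```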
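Let

$$
\begin{aligned}
\tilde{T}_l &= (1-k_l)\{-\log[F(X_{(j)})]\} + k_l\{-\log[F(X_{(i)})]\} \\
&= (1-k_l)\, E_{(n-j+1)} + k_l\, E_{(n-i+1)} \\
&= \sum_{s=1}^{n-j+1} \frac{E_s}{n-s+1} - k_l \sum_{s=1}^{n-j+1} \frac{E_s}{n-s+1} + k_l \sum_{s=1}^{n-i+1} \frac{E_s}{n-s+1} \\
&= \sum_{s=1}^{n-j+1} \frac{E_s}{n-s+1} + k_l \sum_{s=n-j+2}^{n-i+1} \frac{E_s}{n-s+1},
\end{aligned} \tag{26}
$$

where the equalities after the first hold in distribution, so that

$$H(k_l) = P[\tilde{T}_l \ge -\log(1-\beta)]. \tag{27}$$

Similarly define

$$
\begin{aligned}
\tilde{T}_u &= (1-k_u)\{-\log[1 - F(X_{(i)})]\} + k_u\{-\log[1 - F(X_{(j)})]\} \\
&= (1-k_u)\, E_{(i)} + k_u\, E_{(j)} \\
&= \sum_{s=1}^{i} \frac{E_s}{n-s+1} - k_u \sum_{s=1}^{i} \frac{E_s}{n-s+1} + k_u \sum_{s=1}^{j} \frac{E_s}{n-s+1} \\
&= \sum_{s=1}^{i} \frac{E_s}{n-s+1} + k_u \sum_{s=i+1}^{j} \frac{E_s}{n-s+1},
\end{aligned} \tag{28}
$$

so that

$$V(k_u) = P[\tilde{T}_u \ge -\log(1-\beta)].$$

If

$$Y = \sum_{i=1}^{m} \lambda_i E_i, \tag{31}$$

where the $\{E_i\}_{i=1}^{m}$ are iid standard exponential random variables, if the
$\lambda_i$ are distinct, and if $\lambda_i > 0$ for all $i$, then the cdf of $Y$ is (Johnson
and Kotz, 1970, p. 222)

$$P(Y \le y) = 1 - \sum_{i=1}^{m} \frac{\lambda_i^{\,m-1}}{\prod_{j \ne i} (\lambda_i - \lambda_j)}\; e^{-y/\lambda_i}, \quad y \ge 0. \tag{32}$$

By substituting the coefficients in the linear combinations of the $\{E_s\}$
given by (26) and (28) into (32), we obtain closed form expressions for
the probabilities $H(k_l)$ and $V(k_u)$, respectively.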
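
The closed form lends itself to a short numerical routine. The Python sketch below evaluates $H(k_l)$ as the upper tail of a positive linear combination of iid standard exponentials with distinct coefficients (the standard hypoexponential formula), using the coefficients of the lower-limit combination, and then finds $k_l$ by bisection. The function names `h_closed_form` and `solve_k`, the bracketing value `k_hi`, and the tolerance are illustrative assumptions; the sketch is not a transcription of the Appendix B subroutine.

```python
import math


def h_closed_form(p, n, i, j, k):
    """H(k) = P[Y >= -log(p)] for the lower-limit combination Y; p = 1 - beta.

    Assumes 1 <= i < j <= n and k >= 1, so that all coefficients are distinct.
    """
    lam = [1.0 / (n - s + 1) for s in range(1, n - j + 2)]
    lam += [k / (n - s + 1) for s in range(n - j + 2, n - i + 2)]
    q = -math.log(p)
    m = len(lam)
    total = 0.0
    for a, la in enumerate(lam):
        sign, log_prod = 1.0, 0.0
        for b, lb in enumerate(lam):
            if b != a:
                diff = la - lb
                if diff < 0:
                    sign = -sign
                log_prod += math.log(abs(diff))
        # Term lam_a**(m-1) / prod(lam_a - lam_b) * exp(-q/lam_a), via logs.
        total += sign * math.exp((m - 1) * math.log(la) - log_prod - q / la)
    return total


def solve_k(p, n, i, j, gamma, k_hi=1.0e4, tol=1.0e-10):
    """Bisection for the k >= 1 with H(k) = gamma (H is increasing in k)."""
    k_lo = 1.0
    if h_closed_form(p, n, i, j, k_lo) >= gamma:
        return k_lo
    while k_hi - k_lo > tol * k_hi:
        mid = 0.5 * (k_lo + k_hi)
        if h_closed_form(p, n, i, j, mid) < gamma:
            k_lo = mid
        else:
            k_hi = mid
    return 0.5 * (k_lo + k_hi)


# (.90, .95) factors for n = 2 and n = 3; compare with Appendix A.
print(solve_k(0.10, 2, 1, 2, 0.95))
print(solve_k(0.10, 3, 1, 3, 0.95))
```

The two printed factors can be compared with the first two rows of Appendix A (35.177 and 7.859). Bisection is used here only because $H$ is monotone in $k$; any standard root finder would serve equally well.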
7 An Example: Determining a Good $k_l$ for a
(.99, .95) Lower Tolerance Limit With n = 30

The first order statistic, $X_{(1)}$, provides a nonparametric (.99, .95) lower
tolerance limit for $n \ge 299$. We will investigate the tolerance limit $T_l$
given by (14) when $n = 30$. We introduce the more elaborate notation
$k_l^{i,j}$ and $T_l^{i,j}$ in order to emphasize the dependence on the ranks of
the data values used in the tolerance limits. Corresponding to each
$1 \le i < j \le 30$ there is a $k_l^{i,j}$ which provides the desired lower tolerance
limit. Since $H(k_l)$ is available in closed form, it is straightforward to
find these $k_l^{i,j}$ values numerically. We will use as a criterion for choosing
$i$ and $j$ the expected value of $T_l^{i,j}$ under a normal model, the objective
being to choose the tolerance limit for which $x_{1-\beta} - E(T_l^{i,j})$ is as small
as possible.

In Figure 1, $-\log|E(T_l^{i,j})|$ is plotted for each $i < j \le 30$. Since
the tolerance limit problem for the normal distribution is invariant
to a change in location and/or scale, we can, without loss of generality,
consider a N(0,1) population. For this population we have
$x_{1-\beta} = x_{.01} = -2.327$. We can see from this plot that, according
to our criterion, the best tolerance limits have $i = 1$. In Figure 2,
$-E(T_l^{1,j})$ is displayed for $1 < j \le 30$. The Hanson-Koopmans limit,
with $j = i + 1 = 2$, has expected value -15.11. The optimal limit $T_l^{1,12}$
has expected value -4.743, a gain of over ten standard deviations in the
expectation. The expected value of the parametric normal tolerance
limit (e.g., Owen, 1968) is equal to -3.038. The Woodward-Frawley
limit $T_l^{1,30}$ is somewhat more conservative than the optimal limit, but
it also is much better than $T_l^{1,2}$.

We now make a case for using nonparametric over parametric tolerance
limits for extreme quantiles when the sample size is small. We
consider a Weibull population with a shape parameter of 10 and a scale
parameter of 1, that is

$$F_X(x) = 1 - e^{-x^{10}}. \tag{33}$$

From Figure 3, we see that this Weibull population looks ‘roughly
normal’, but, of course, any such resemblance breaks down in the tails.
Let $n = 30$, $\beta = .99$, and $\gamma = .95$. We will compare the nonparametric
lower tolerance limit $T_l^{1,12}$ with the parametric normal limit $\bar{x} - 3.063s$,
where $\bar{x}$ and $s$ are the sample mean and standard deviation. The
consequences of using a normal tolerance limit procedure when the
population is (33) are examined by taking 500 random samples of size
30 from (33) and calculating $\bar{x} - 3.063s$ and $T_l^{1,12}$ for each. In Figure
4, a histogram of the normal tolerance limit values is displayed. Note
the extreme anticonservatism: the actual confidence of about 68% is
much less than the nominal confidence of 95%. However, in Figure 5,
the tolerance limit $T_l^{1,12}$ has actual confidence of 98%, which is quite
respectable, considering that we have only one tenth the data that
would be required for the usual single order statistic nonparametric
limit.

8 Log Transformation for Data With Positive Support

If the random variables $\{X_i\}$ have positive support, then a positive
tolerance limit is desirable. By taking logarithms of the data, calculating
$T_l$ or $T_u$, and exponentiating the result, we obtain the lower and
upper limits

$$L_l = x_{(j)} \left( x_{(i)} / x_{(j)} \right)^{k_l} \tag{34}$$

and

$$L_u = x_{(i)} \left( x_{(j)} / x_{(i)} \right)^{k_u}, \tag{35}$$

respectively. Although these tolerance limits are valid for classes of
distributions more narrow than the classes corresponding to $T_l$ and $T_u$,
this is not likely to be a problem in practice, and the limits (34) and
(35) can be substantially closer to the quantile in expectation than
(14) and (20). Still, the transformation remains ad hoc, pending further
investigation. In Figure 6, (34) was used instead of (14) for the
500 simulated datasets, with some reduction in conservatism of the
tolerance limit.
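
To make the relationship between (14) and (34) concrete, the Python sketch below computes both limits from one sample. The function names and the data values are hypothetical; the factor 2.137 is the Appendix A entry for $n = 10$, $i = 1$, $j = 6$ at (.90, .95).

```python
import math


def hk_lower_limit(sample, i, j, k):
    """T_l = x_(j) + k * (x_(i) - x_(j)), equation (14); ranks are 1-based."""
    x = sorted(sample)
    return x[j - 1] + k * (x[i - 1] - x[j - 1])


def hk_lower_limit_log(sample, i, j, k):
    """L_l = x_(j) * (x_(i) / x_(j))**k, equation (34); needs positive data."""
    x = sorted(sample)
    if x[0] <= 0:
        raise ValueError("the log transformation needs strictly positive data")
    return x[j - 1] * (x[i - 1] / x[j - 1]) ** k


# Hypothetical strength measurements, n = 10.
data = [101.2, 97.5, 104.8, 99.1, 95.3, 102.6, 98.4, 100.9, 103.3, 96.7]
print(hk_lower_limit(data, 1, 6, 2.137))
print(hk_lower_limit_log(data, 1, 6, 2.137))
```

For $k_l \ge 1$ the log-transformed limit is never below the untransformed one (by Bernoulli's inequality) and is always positive, which is consistent with the reduction in conservatism noted above.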

9  Two  Order  Statistic  Tolerance  Limits  for  All 
Sample  Sizes 

On the basis of extensive computation for the cases of (.90, .95) and
(.99, .95) lower tolerance limits, we conjecture that the ‘best’ two order
statistic lower tolerance limit for any sample size $n > 1$, any $\beta$, and
any $\gamma$ is $T_l^{i_0,j}$, where $i_0 - 1$ is the rank corresponding to the single
order statistic limit, and where we adopt the convention of associating the
rank 0 to situations where a single order statistic limit does not exist.
Different criteria for what is best will result in different choices of $j$,
but most reasonable criteria will probably result in a $j$ for which $j - i_0$
is fairly large, and the precise $j$ selected is of secondary importance.

To fix ideas, we adopt here the expected value under a normal model
as a criterion, as discussed in previous sections. For this criterion, if
we fix $\beta$ and $\gamma$, choose $i = i_0(n, \beta, \gamma)$, and find the optimal $j$, then we
can regard $k_l$ to be a function of $n$ alone. If $\tilde{n}$ is the smallest value
of $n$ for which a certain order statistic provides a single order statistic
nonparametric limit, then $k_l(\tilde{n})$ will equal 1.

The values $k_l(n)$ of the previous paragraph provide two order statistic
tolerance limits which reduce to single order statistic limits at certain
sample sizes, and which are less conservative than the single order
statistic limits in between these sample sizes. We illustrate this idea
in Figure 7, for the case $\beta = .90$ and $\gamma = .95$. For this case, the first
order statistic provides a nonparametric lower limit for $n = 29$ and
the second order statistic provides a limit when $n = 46$. In between,
the first order statistic provides a tolerance limit which increases in
conservatism as $n$ increases from 30 to 45. The ‘optimal’ two order
statistic limit, however, equals the first order statistic when $n = 29$,
equals the second order statistic when $n = 46$, and steadily increases
in expected value at intermediate sample sizes. The expectation of a
lower tolerance limit should increase monotonically with $n$. This is an
appealing characteristic of the two order statistic limit of this and the
previous section, which is not shared by the usual nonparametric limit.

A similar argument can probably be made for upper limits and for
percentiles and confidences other than (.90, .95) and (.99, .95), and this
will be a subject of future work.

10  Tables 

A short table for lower (.90, .95) tolerance limits is provided in Appendix
A, and a FORTRAN subroutine for determining $H(k_l)$ is given
in Appendix B. Additional tables can easily be created. However, if
one wishes to use two order statistic tolerance limit factors beyond the
cases covered by this table, and if one does not want to compute the
$k_l^{i_0,j}$, the tables in Woodward and Frawley (1980) can be used.

11  Conclusion 

The main point of this article is that the commonly held notion that
one-sided nonparametric tolerance limits do not exist for certain $\beta$, $\gamma$,
and $n$ is misleading. We have shown that useful one-sided tolerance
limits involving two order statistics can be obtained for any situation,
and that these limits are valid over large classes of distributions.

In this article, we also discuss the usefulness of a log transformation,
the choice of which order statistics to use, and the idea of using the proposed
limits for all sample sizes, not merely for those cases for which
the usual nonparametric method is not available. These discussions are
somewhat ad hoc, and considerable work remains to be done. But the
potential usefulness of the proposed limits is clear, and one should not
hesitate to make use of them even before all of the theoretical details
are worked out.
[Figures 1-7 are not reproduced. Recoverable titles and axis labels: "Optimizing the method with respect to normal(0,1)"; "... and Weibull densities"; "Tolerance limit using normal method"; "HK(1,12) tolerance limits"; "Tolerance limit using HK(1,12) method"; "HK(1,12) tolerance limits (log trans.)"; "Sample Size".]


A Factors $k_l^{i,j}$ for (.90, .95) Lower Tolerance Limits

 n    i    j    k
 2    1    2    35.177
 3    1    3     7.859
 4    1    4     4.505
 5    1    4     4.101
 6    1    5     3.064
 7    1    5     2.858
 8    1    6     2.382
 9    1    6     2.253
10    1    6     2.137
11    1    7     1.897
12    1    7     1.814
13    1    7     1.738
14    1    8     1.599
15    1    8     1.540
16    1    8     1.485
17    1    8     1.434
18    1    9     1.354
19    1    9     1.311
20    1   10     1.253
21    1   10     1.218
22    1   10     1.184
23    1   11     1.143
24                1.114
25                1.087
26                1.060
27                1.035
28    1   12     1.010
29    1    -     1
30    2   12     1.373
31    2   12     1.344
32    2   12     1.315
33    2   13     1.270
34    2   13     1.245
35    2   13     1.221
36    2   13     1.197
37    2   13     1.174
38    2   13     1.151
39    2   13     1.129
40    2   13     1.108
41    2   14     1.083
42    2   14     1.064
43    2   14     1.045
44    2   14     1.027
45    2   14     1.009
46    2    -     1


B A FORTRAN Subroutine for Determining $H(k_l)$

The following subroutine determines $H(k_l)$ using the closed form expression. With
trivial modifications, this subroutine can be adapted for calculating $V(k_u)$. By using
this subroutine along with a standard algorithm for finding a zero of a nonlinear
function, a program can be written for determining the constants $k_l$ and $k_u$.

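      double precision function hk(p, n, i, j, xk)
c     H(k) = P[Y .ge. -log(p)], where Y is the linear combination (26)
c     of iid standard exponentials, p = 1 - beta, and xk = k.
      parameter (nmax = 1000)
      implicit double precision (a-h, o-z)
      dimension a(0:nmax), b(0:nmax), s(0:nmax)
c     Coefficients of (26).  The padding term a(0) = 0 contributes the
c     factor a(m) to each product of differences formed below.
      a(0) = 0
      do 10 m = 1, n-j+1
         a(m) = 1.d0/(n-m+1)
   10 continue
      do 20 m = n-j+2, n-i+1
         a(m) = xk/(n-m+1)
   20 continue
c     s(m): sign of the product of (a(m)-a(mm)) over mm .ne. m.
c     b(m): log of the absolute value of that product.
      do 30 m = 0, n-i+1
         b(m) = 0
         s(m) = 1
         do 40 mm = 0, n-i+1
            if (mm .eq. m) go to 40
            r = abs(a(m)-a(mm))
            s(m) = s(m)*(a(m)-a(mm))/r
            b(m) = b(m) + log(r)
   40    continue
   30 continue
c     Sum the terms of the upper tail probability of Y at q; by (27)
c     this sum is H(k).
      q = -log(p)
      hk = 0
      do 60 m = 1, n-i+1
         r = abs(a(m))
         s1 = a(m)/r
         con = s(m)*s1*exp((n-i+1)*log(r) - b(m) - q/a(m))
         hk = hk + con
   60 continue
      return
      end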
ROBUST STATISTICAL DECISIONS
(AN  EMPIRICAL  INVESTIGATION) 

EUGENE  DUTOIT 
U.S.  ARMY  INFANTRY  SCHOOL 
FORT  BENNING,  GEORGIA 


ABSTRACT:  THE  PURPOSE  OF  THIS  STUDY  WAS  TO  DETERMINE,  IN  AN  EMPIRICAL 
WAY, THE ROBUSTNESS OF DATA-BASED STATISTICAL DECISIONS WHEN SEVERAL
REASONABLE, BUT NOT EXACT, ANALYSIS METHODS WERE USED ON THESE DATA.
THE AUTHOR PRESENTED THE EMERGING RESULTS OF THIS EMPIRICAL STUDY
THROUGH THE ANALYSIS OF SEVERAL DATA SETS. AT LEAST TWO METHODS OF
HYPOTHESIS  TESTING  WERE  APPLIED  TO  EACH  DATA  SET  AND  THE 
PROBABILITIES  OF  REJECTING  THE  NULL  HYPOTHESIS  WERE  COMPUTED  AND 
COMPARED.  THESE  DATA  SETS  WERE  OBTAINED  FROM  ACTUAL  FIELD/SIMULATION 
EXPERIMENTS  AND  RANDOMLY  SELECTED  FROM  STATISTICAL  TEXTS.  THE 
CONFERENCE  ATTENDEES  WERE  ASKED  TO  SHARE  THEIR  OWN  EXPERIENCES  USING 
REASONABLE  APPROACHES  OF  ANALYSES  ON  THE  SAME  DATA  SETS.  AN  ANALYSIS 
FORM  WAS  HANDED  OUT  TO  ANY  ATTENDEE  WHO  MIGHT  LIKE  TO  COLLABORATE. 

THE  FOCUS  OF  THIS  STUDY  IS  ON  HYPOTHESIS  TESTING,  NOT 
ESTIMATION.  IT  IS  IMPORTANT  TO  EMPHASIZE  THAT  THE  SIGNIFICANCE  LEVELS 
THAT ARE OFTEN USED IN THE WEAPONS ACQUISITION PROCESS APPROACH 10%.
RELATIVELY SMALL SAMPLE SIZES ARE USED BECAUSE THE ITEMS ARE
EXPENSIVE. THE TOPIC OF THE PAPER MAY SOUND LIKE HERESY. THE STUDY
IS NOT ADVOCATING WRONG ANALYSIS. THE GOAL IS SIMPLY TO GET A
"FEELING" FOR THE SENSITIVITY OF THE DECISION MAKING PROCESS WHEN
ALTERNATIVE  METHODS  OF  ANALYSES  ARE  USED  TO  EXAMINE  THE  DATA.  THIS 
QUEST  HAS  COME  ABOUT  BECAUSE  SOMETIMES  STUDY  REVIEWERS  AND  DECISION 
MAKERS  WILL  ASK  IF  THE  "SO  AND  SO"  METHOD  WAS  USED  IN  THE  STUDY 
BECAUSE  THEY  HEARD  OR  READ  IN  ANOTHER  STUDY  THAT  THIS  ALTERNATIVE 
METHOD  WAS  AN  APPROPRIATE  WAY  TO  GO  OR  WAS  THE  MOST  CONSERVATIVE  OR 
LIBERAL  FOR  THE  STATED  CONDITIONS.  THESE  COMMENTS  MAY,  AND  OFTEN  DO, 
HAVE  SOME  MERIT.  IT  WOULD  BE  NICE  TO  HAVE  SOME  ANSWER  AND  BE  ABLE  TO 
SAY  SOMETHING  LIKE  THIS ...  "ALTHOUGH  I  CANNOT  SAY  THAT  WE  USED  YOUR 
SPECIFIC  METHOD  TO  ANALYZE  THESE  DATA  WE  DID  USE  AT  LEAST  TWO 
APPROACHES  TO  THE  DATA  ANALYSIS.  WE  HAVE  SOME  HISTORICAL  BASIS  TO  SAY 
THAT  THE  DECISION  THAT  WOULD  HAVE  RESULTED  IF  WE  HAD  USED  AN 
ALTERNATIVE METHOD WOULD HAVE NOT (OR HAVE) BEEN DIFFERENT." IT IS
IMPORTANT  TO  POINT  OUT  TO  THE  READER  THAT  THIS  PAPER  CONTAINS  NO 
CLASSIFIED  WEAPONS  DATA.  THE  EXAMPLES  THAT  ARE  GIVEN  IN  THESE 
PARAGRAPHS  USE  DATA  OBTAINED  FROM  TEXTBOOKS  OR  GRADUATE  STUDENT 
RESEARCH. 

FIGURE  1  IS  THE  BASIC  "DATA  COLLECTOR"  FOR  THIS  EMPIRICAL 
STUDY.  THIS  IS  THE  FORM  THAT  WAS  DISTRIBUTED  AT  THE  ARMY  DESIGN  OF 
EXPERIMENTS CONFERENCE IN ORDER TO GET ADDITIONAL INPUT FROM SOME OF
THE ATTENDEES. THE FORM SHOULD BE SELF-EXPLANATORY. THERE IS SPACE
FOR A BRIEF DESCRIPTION OF THE PROBLEM AND THE VARIABLES THAT ARE
BEING COMPARED IN THE STUDY. THESE ARE THE VARIABLES THAT ARE
IMPORTANT IN THE STATISTICAL DECISION. THERE IS A SPACE TO ENTER THE
VALUE OF "P" FOR EACH STATISTICAL METHOD USED FOR HYPOTHESIS
TESTING. THE DECISION INDICATES EITHER A STATISTICALLY SIGNIFICANT
DIFFERENCE (SD) OR NO SIGNIFICANT DIFFERENCE (NSD). THE AUTHOR'S
COMPLETE MAILING ADDRESS, TELEPHONE NUMBERS AND FAX NUMBER APPEAR ON
THE  BOTTOM  OF  THE  FORM. 

FIGURE 2 GIVES A SUMMARY OF SOME OF THE STATISTICAL DECISIONS
THAT CAN BE PRESENTED IN THESE PROCEEDINGS. METHOD 1 IS THE PREFERRED
TECHNIQUE BASED ON THE UNDERLYING DISTRIBUTIONS OF THE DATA. METHOD 2
IS A REASONABLE ALTERNATIVE METHOD (USUALLY NONPARAMETRIC) THAT IS
NORMALLY DONE AS A MATTER OF COURSE. THIS CONVENTION HOLDS FOR THE
FIRST THREE STUDIES LISTED IN FIGURE 2. THE FLEITAS STUDY HAD A MIX
OF PARAMETRIC AND NONPARAMETRIC TECHNIQUES AS THE PREFERRED METHOD 1.
THIS  FIGURE  SHOWS  THAT  THERE  IS  A  HIGH  DEGREE  OF  AGREEMENT  BETWEEN 
THE  "P"  VALUES  FOR  THE  ALTERNATIVE  STATISTICAL  METHODS.  THEREFORE 
THERE  IS  A  CORRESPONDING  DEGREE  OF  AGREEMENT  IN  THE  RESULTANT 
STATISTICAL  DECISIONS.  THIS  SMALL  SAMPLE  OF  RESULTS  WOULD  INDICATE 
THAT  THESE  STATISTICAL  DECISIONS  ARE  FAIRLY  ROBUST  WITH  RESPECT  TO 

DATA SETS (AND TESTING TO THE 10% LEVEL OF SIGNIFICANCE). IN A
SENSE I WILL "FLING DOWN THE GAUNTLET" AND CAUTIOUSLY AND TENTATIVELY
SAY  THAT  THE  REASONABLE  ALTERNATIVE  METHOD  WILL  PROVIDE  THE  DECISION 
MAKER  A  BASIS  FOR  ARRIVING  AT  A  CONSISTENT  CONCLUSION. 

I WOULD APPRECIATE YOUR THOUGHTS AND COMMENTS ABOUT THIS
STUDY. IF YOU COULD OR WOULD LIKE TO PROVIDE INSIGHTS, THOUGHTS,
COMMENTS AND EXAMPLES CONCERNING ALTERNATIVE METHODS OF HYPOTHESIS
TESTING, I WOULD BE DELIGHTED TO HEAR FROM YOU. YOU CAN REACH ME USING
THE INFORMATION PROVIDED AT THE BOTTOM OF FIGURE 1.
FIGURE 1

PROBLEM DESCRIPTION


VARIABLES COMPARED


                               "P VALUE"     DECISION     COMMENTS

PREFERRED METHOD (ALT # 1)

ALTERNATIVE METHOD (ALT # 2)

ALTERNATIVE METHOD (ALT # 3)


PLEASE SEND THIS COMPLETED FORM TO
COMMANDANT
U S ARMY INFANTRY SCHOOL
ATTN: ATSH - CDC - 0 (DUTOIT)
FORT BENNING GA 31905-5400

(404) 545-3165/3166
DSN 835-3165/3166
FAX (404) 545-2517

THANK YOU FOR YOUR HELP,

GENE DUTOIT
FIGURE 2

SUMMARY OF STATISTICAL DECISIONS

16 DIMENSIONS OF            NONPARAMETRIC           WERE CLOSE &
BEHAVIOR & PERSONALITY      (T-TEST & MANN-         THE DECISIONS
COMPARISONS OF TWO          WHITNEY)                MATCHED 15/16
GROUPS                                              TIMES
THE EFFECT OF SAMPLE SIZE ON THE VARIABILITY OF SAMPLE MATERIAL
PROPERTY VALUES

Bernard Harris

1. Introduction and Summary. In certifying materials for aircraft construction, a commonly used
certification criterion is called a statistically based material property value. For the specific
application being considered, the tests are both destructive and expensive to carry out. Consequently,
very small samples are usually employed. Therefore, it is desirable to study the small sample
behavior of these statistically based material property values. In the usual statistical terminology
(rather than the terminology of materials testing used above), a statistically based material property
value is a lower tolerance limit. Specifically, a B-basis value is a 95% lower confidence limit on the
tenth percentile of the probability distribution which has been assumed, and an A-basis value is a 95%
lower confidence limit on the first percentile.

In this report, we assume that the data is a random sample from a normal distribution with
unknown mean $\mu$ and unknown variance $\sigma^2$. It is well-known that these statistically based
material property values (tolerance limits) can be written as

$$T_{n,\alpha,\gamma}(\bar{X}, s) = \bar{X} - k(n,\alpha,\gamma)\,s, \tag{1}$$

where

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}},$$

and $k(n,\alpha,\gamma)$ is $1/\sqrt{n}$ times the $100(1-\gamma)$th percentile of the noncentral
$t$-distribution with $n-1$ degrees of freedom and non-centrality parameter
$\sqrt{n}\,\Phi^{-1}(1-\alpha)$, where $\Phi$ is the standard normal cumulative distribution function
(here $1-\gamma = .95$, the confidence level). For an A-basis value, $\alpha = .01$; for a B-basis
value, $\alpha = .10$. For notational simplicity, we will denote (1) by $T$, omitting the subscripts.
In order to provide a simple picture of the behavior of $T$ as $n$ changes, the moments of $T$ are
derived in the Appendix to this report.

In Section 2, some numerical tabulations are presented which provide concrete illustrations of the
material given in the Appendix. These illustrate the variability of $T$ for sample sizes from 2 to
500 for both A-basis values and B-basis values and the changes in such variability with increasing
sample sizes.


The preparation of this report was motivated by the following considerations. The author has
been a member of a working group concerned with the certification of advanced composites for
aircraft construction. In the course of this activity, the author has come to suspect that some of
the materials engineers conclude that the B-basis (A-basis) values are intrinsic material properties
and not random variables. It is the purpose of this report to demonstrate that these are random
variables and possess statistical variability.

It is not difficult to extend the calculations in Section 2 to obtain the probability density function
of $T$. However, the simple description of the variability given by the mean, variance, and two-sigma
limits for $T$ provides the user with a suitable description of the variability that will be encountered.

2. Numerical Illustrations of the Variability of Statistically Based Material Property Values. In this
section, we provide numerical illustrations of the mean and standard deviation of A-basis and
B-basis values for samples of size 2, 3, ..., 25, 30, 35, ..., 50, 60, ..., 100, 200, 500, $\infty$ for
normally distributed data from a population with $\mu = 200$, $\sigma = 10$ (values which are
reasonable for some composite materials). Also, two-sigma limits for such A-basis and B-basis values
are given. The $k$-values employed have been taken from MIL-HDBK-17-1C, draft dated 7 September 1990,
Materials Technology Laboratory, U.S. Army.
Table 1

The Expected Value, Standard Deviation, and Two-Sigma Limits of B-basis Values in Samples
of Size n from a Normally Distributed Population with Mean $\mu = 200$ and Standard Deviation
$\sigma = 10$.

Sample Size n   Expected Value   Standard Deviation   Two-Sigma Limits
      2             35.787           124.266          (  0.000, 284.316)
      3            145.435            29.101          ( 87.233, 203.637)
      4            161.676            16.941          (127.794, 195.558)
      5            167.965            12.459          (142.867, 193.063)
      6            171.387            10.109          (151.169, 191.605)
      7            173.560             8.646          (156.268, 190.852)
      8            175.073             7.639          (159.795, 190.351)
      9            176.213             6.893          (162.427, 189.999)
     10            177.094             6.318          (164.458, 189.730)
     11            177.801             5.858          (166.085, 189.517)
     12            178.386             5.480          (167.426, 189.346)
     13            178.884             5.162          (168.560, 189.208)
     14            179.311             4.890          (169.531, 189.091)
     15            179.676             4.655          (170.366, 188.986)
     16            179.996             4.450          (171.096, 188.896)
     17            180.290             4.267          (171.756, 188.824)
     18            180.548             4.104          (172.340, 188.756)
     19            180.779             3.958          (172.863, 188.695)
     20            180.982             3.826          (173.330, 188.634)
     21            181.177             3.705          (173.767, 188.587)
     22            181.353             3.594          (174.173, 188.533)
     23            181.511             3.493          (174.525, 188.492)
     24            181.660             3.399          (174.862, 188.458)
     25            181.801             3.312          (175.177, 188.425)
     30            182.373             2.956          (176.461, 188.285)
     35            182.797             2.691          (177.415, 188.179)
     40            183.128             2.484          (178.160, 188.096)
     45            183.405             2.317          (178.771, 188.039)
     50            183.624             2.180          (179.264, 187.984)
     60            183.978             1.962          (180.054, 187.902)
     70            184.237             1.799          (180.639, 187.835)
     80            184.449             1.669          (181.111, 187.787)
     90            184.623             1.563          (181.497, 187.749)
    100            184.769             1.475          (181.819, 187.719)
    200            185.518             1.014          (183.498, 187.538)
    500            186.147             0.630          (184.887, 187.407)
      ∞            187.180             0.000          (187.180, 187.180)
Table 2

The Expected Value, Standard Deviation, and Two-Sigma Limits of A-basis Values in Samples
of Size n from a Normally Distributed Population with Mean $\mu = 200$ and Standard Deviation
$\sigma = 10$.

Sample Size n   Expected Value   Standard Deviation   Two-Sigma Limits
      2            -95.967           223.718          (  0.000, 351.469)
      3            106.476            50.913          (  4.650, 208.302)
      4            137.592            27.833          ( 81.926, 193.258)
      5            146.035            20.093          (105.849, 186.221)
      6            151.833            16.094          (119.645, 184.021)
      7            155.466            13.632          (128.202, 182.730)
      8            157.983            11.949          (134.085, 181.881)
      9            160.019            10.716          (138.587, 181.451)
     10            161.278             9.771          (141.736, 180.820)
     11            162.430             9.019          (144.392, 180.468)
     12            163.371             8.405          (146.561, 180.181)
     13            164.164             7.891          (148.382, 179.946)
     14            164.832             7.456          (149.920, 179.744)
     15            165.422             7.078          (151.266, 179.578)
     16            165.932             6.750          (152.432, 179.432)
     17            166.389             6.459          (153.471, 179.307)
     18            166.792             6.201          (154.390, 179.194)
     19            167.149             5.970          (155.209, 179.089)
     20            167.481             5.761          (155.959, 179.003)
     21            167.775             5.571          (156.633, 178.917)
     22            168.052             5.397          (157.258, 178.846)
     23            168.302             5.238          (157.826, 178.778)
     24            168.534             5.091          (158.352, 178.716)
     25            168.747             4.955          (158.837, 178.657)
     30            169.623             4.402          (160.819, 178.427)
     35            170.269             3.994          (162.281, 178.257)
     40            170.779             3.677          (163.425, 178.133)
     45            171.184             3.422          (164.340, 178.028)
     50            171.526             3.212          (165.102, 177.950)
     60            172.049             2.884          (166.281, 177.817)
     70            172.450             2.636          (167.178, 177.722)
     80            172.756             2.442          (167.872, 177.640)
     90            173.016             2.283          (168.450, 177.582)
    100            173.228             2.152          (168.924, 177.532)
    200            174.332             1.469          (171.394, 177.270)
    500            175.263             0.902          (173.459, 177.067)
      ∞            176.740             0.000          (176.740, 176.740)
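Appendix

Let $X_1, X_2, \ldots, X_n$ be independent normally distributed random variables with mean $\mu$ and
variance $\sigma^2$. Let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2.$$

Then $\bar{X}$ and $s^2$ are independent, $\bar{X}$ is normally distributed with mean $\mu$ and variance
$\sigma^2/n$, and $(n-1)s^2/\sigma^2$ has the chi-square distribution with $n-1$ degrees of freedom.

To simplify notation, we set $T_{n,\alpha,\gamma}(\bar{X}, s) = T$, and write

$$E(T^{\nu}) = \sum_{j=0}^{\nu} \binom{\nu}{j} E(\bar{X}^{j})\,(-1)^{\nu-j}\,k_{n,\alpha,\gamma}^{\nu-j}\,E(s^{\nu-j}). \tag{1}$$

To evaluate $E(\bar{X}^{j})$, write

$$\bar{X} = \bar{X} - \mu + \mu.$$

Then,

$$E(\bar{X}^{j}) = \sum_{r=0}^{[j/2]} \binom{j}{2r} \frac{\mu^{j-2r}\,2^{r}\,\sigma^{2r}\,\Gamma\!\left(\frac{2r+1}{2}\right)}{n^{r}\sqrt{\pi}}. \tag{2}$$

Also, since

$$s = \sigma\sqrt{u/m},$$

where $u$ has the chi-square distribution with $m = n-1$ degrees of freedom,

$$E(s^{k}) = \int_{0}^{\infty} \frac{u^{k/2}\sigma^{k}}{m^{k/2}}\,\frac{e^{-u/2}\,u^{m/2-1}}{2^{m/2}\,\Gamma\!\left(\frac{m}{2}\right)}\,du
= \frac{\sigma^{k}\,2^{k/2}\,\Gamma\!\left(\frac{m+k}{2}\right)}{m^{k/2}\,\Gamma\!\left(\frac{m}{2}\right)}. \tag{3}$$

Substituting (2) and (3) into (1), we obtain

$$E(T^{\nu}) = \sum_{j=0}^{\nu} \binom{\nu}{j} (-1)^{\nu-j}\,k_{n,\alpha,\gamma}^{\nu-j}\,
\frac{\sigma^{\nu-j}\,2^{(\nu-j)/2}\,\Gamma\!\left(\frac{n+\nu-j-1}{2}\right)}{(n-1)^{(\nu-j)/2}\,\Gamma\!\left(\frac{n-1}{2}\right)}
\sum_{r=0}^{[j/2]} \binom{j}{2r} \frac{\mu^{j-2r}\,2^{r}\,\sigma^{2r}\,\Gamma\!\left(\frac{2r+1}{2}\right)}{n^{r}\sqrt{\pi}}. \tag{4}$$

In particular, the variance of $T$ is easily written as

$$\sigma_T^2 = \sigma_{\bar{X}}^2 + k^2\sigma_s^2. \tag{5}$$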
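The first two moments above can be checked numerically against the tabulated values. The following is a minimal sketch in Python, not the author's program: the function names are hypothetical, and the factor k = 2.355 for a B-basis value with n = 10 is an assumed tabulated value of the kind published in MIL-HDBK-17, not a number taken from this report. It uses Equation 3 with exponent 1 for E(s) and Equation 5 for the variance.

```python
import math


def mean_s_over_sigma(n):
    """E(s)/sigma for a normal sample of size n (Equation 3 with exponent 1)."""
    m = n - 1
    log_ratio = math.lgamma((m + 1) / 2.0) - math.lgamma(m / 2.0)
    return math.sqrt(2.0 / m) * math.exp(log_ratio)


def basis_value_moments(n, k, mu, sigma):
    """Mean and standard deviation of T = Xbar - k*s for normal data."""
    mean_s = sigma * mean_s_over_sigma(n)
    var_s = sigma ** 2 - mean_s ** 2          # because E(s^2) = sigma^2
    mean_t = mu - k * mean_s
    var_t = sigma ** 2 / n + k ** 2 * var_s   # Equation 5
    return mean_t, math.sqrt(var_t)


# Assumed tolerance factor for a B-basis value with n = 10 (hypothetical input).
K_B_10 = 2.355
mean_t, sd_t = basis_value_moments(10, K_B_10, mu=200.0, sigma=10.0)
lower, upper = mean_t - 2.0 * sd_t, mean_t + 2.0 * sd_t
print(round(mean_t, 3), round(sd_t, 3), round(lower, 3), round(upper, 3))
```

Under these assumptions the output agrees with the n = 10 row of Table 1 to about three decimal places, which illustrates how strongly the basis value varies at small n.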
Small Sample Design Allowables
From  Paired  Data  Sets 

Donald  M.  Neal  Trevor  D.  Rudalevige  Mark  G.  Vangel 

U.S.  Army  Materials  Technology  Laboratory 
SLCMT-MRS-MM  Arsenal  Street 
Watertown,  Massachusetts  02172-0001 


Abstract 

This paper identifies an acceptable statistical procedure
for obtaining design allowable values from a small set of
material strength data. The allowable represents a material
design number defined as the 95% lower confidence
bound on the specified percentile of the population of
material strength data. The percentiles are the first and tenth
for the A and B allowables. The proposed method reduces
the penalties commonly associated with small sample
allowable computation by accurately maintaining the definition
requirements and reducing variability in the estimate.
Application of very small samples will obviously reduce costs in
testing and manufacturing, which is the primary motivation
for this study.

In the evaluation process five methods were considered for
computing the design allowable. Three of these methods
involved certain statistical distribution assumptions while the
other two were nonparametric procedures. The latter methods
introduced a pooling process such that the small sample
was combined with a larger, previously obtained sample.

Monte Carlo studies showed that the nonparametric
procedures are the most desirable for computing the design
allowable value.
1 Introduction

The A or B statistically based design allowable value is a statistic which is less than
the first or tenth percentile of the population with probability .95. That is, the value
is a 95% lower tolerance limit for the percentile. In Figures 1A and 1B, a graphical
display is shown for the B allowable value probability density function for sample sizes
of n equal to 10 and 50 from a standard normal population. The dotted vertical lines
indicate the tenth percentile of the population, and the probability that the allowable
is less than or equal to the tenth percentile is .95 for the design allowable value
probability density function. The graphical display of the allowable value density
functions shows much less dispersion for n = 50 than for n = 10. Therefore, small
samples will usually result in lower allowable values. In References 1 to 5, various procedures
are described for determining the statistical design allowable values.

The motivation for the work described in this paper resulted from a need by the
aircraft  industry  to  obtain  a  less  conservative,  statistically  based  material  design  value 
from  a  small  sample  of  composite  material  strength  data.  Here,  ‘conservative’  is  to  be 
interpreted  to  mean  ‘excessively  low’,  which  corresponds  to  a  design  engineer’s  use  of 
the  word.  Statistical  conservatism,  that  is  a  confidence  exceeding  the  nominal  level 
of  .95,  need  not  be  present  for  ‘engineering  conservatism’  to  be  a  problem. 


The  use  of  small  samples  reduces  the  amount  of  testing  and  consequently  the 
manufacturing  cost  of  composite  aircraft  structures.  For  example,  in  order  to  qualify 
a composite material to be used in the manufacture of a commercial aircraft, the
FAA,6 requires property values for tension, compression, and shear tests subjected
to the environmental conditions: hot-wet, cold-dry, and room temperature for three
separate  batches  of  material.  In  the  development  of  a  composite  tail  section  by  one 
of  the  major  aircraft  companies  the  cost  of  testing  was  more  than  20  million  dollars. 
In  addition  to  the  cost,  excessively  conservative  allowable  values  can  also  result  in 
an  over-design  situation,  since  the  value  often  provides  information  in  determining  a 
structural  design. 


In  order  to  avoid  the  penalty  associated  with  using  small  samples  in  the  tolerance 
limit  computation,  a  procedure  is  introduced  in  this  paper  involving  pooling  a  large 

1 Military Handbook 17B, Materials Technology Laboratory, Polymer Matrix Composites, Volume 1,
Guidelines.

2 Neal, D. M., Vangel, M. G., and Todt, F., “Determination of Statistical Based Composite Material
Properties”, Engineered Materials Handbook, C. A. Dostal, ed., American Society of Metals.

3 Neal, D. M. and Vangel, M. G., “Statistically Based Material Properties - A Military Handbook-17
Perspective”, MTL TR 90-5, U.S. Army Materials Technology Laboratory, Watertown, Massachusetts
02172-0001, 1990.

4 Neal, D. M. and Spiridigliozzi, L., “An Efficient Method for Determining the 'A' and 'B' Design
Allowables”, ARO Report 83-2, U.S. Army Laboratory Command, Army Research Office, PO Box 12211,
Research Triangle Park, North Carolina 27709-2211, 1983.

5 ... of Reduction in the Development of Design Allowables for Composites, Test Methods for Design
Allowables for Fibrous Composites: 2nd Vol., ASTM STP 1003, pp. 111-135, 1989.

6 Soderquist, Joseph, National Resource Specialist for Composites (FAA), Private Conversation.
sample with a smaller one in order to obtain the allowable value. This is done in
order to reduce the inherent variability that occurs from applying the smaller data
set alone.

In  the  pooling  process  the  larger  data  set  should  be  obtained  from  prior  available 
test  results  or  from  less  expensive  tests.  Ideally,  both  samples  should  be  from  the 
same  material,  test,  and  environmental  conditioning  process.  In  the  pooling  process  it 
is  assumed  that  for  a  given  material  (eg.,  graphite-epoxy)  there  are  similar  classes  of 
failure  modes. 

In  order  to  avoid  the  uncertainties  involved  in  identifying  a  statistical  model  from  a 
small sample when computing the allowable value, this paper introduces two nonparametric
methods (Ferguson,7 and the Modified Hanson-Koopmans,8). In applying the
Bayesian  nonparametric  method,  the  larger  set  represents  the  prior  and  the  smaller 
one  the  empirical  data.  In  the  Modified  Hanson-Koopmans  method  an  ordered  array 
of  strength  measurements  is  obtained  from  the  pooled  data  sets.  The  tolerance  limit 
is  determined  from  a  specific  ratio  of  ordered  values  multiplied  by  a  factor  determined 
from  the  sample  size  of  the  pooled  data. 

The Reduced Ratio Method,9 another procedure for computing small sample design
allowables, was also evaluated. This method is commonly used by the aircraft
industry. For example, a U.S. helicopter company routinely uses this method for
obtaining allowables from six specimens tested in tension at 180°F. In the analysis an
additional, previously obtained sample of at least thirty room temperature tension
test results is included in order to reduce variability in the allowable estimate.

2  Determination  of  Allowable  Values 
Nonparametric  Bayesian  Method 

The nonparametric Bayesian,7 allowable value is obtained from the following. Let
$\{x_i\}_1^n$ represent the current empirical data which the allowable value is to represent
and $\{t_j\}_1^m$ the larger prior data set obtained from previous test results.

In the analysis the cumulative distribution function (CDF) of the prior (larger data
set) is written as

$$F_0(t) = \alpha((-\infty, t])/\alpha(R) \tag{1}$$

where $\alpha(R)$ is the sample size and $\alpha((-\infty, t])$ represents the number of values less
than or equal to $t$ from $\{t_j\}_1^m$. The CDF of the smaller sample $\{x_i\}_1^n$ is

$$F_n(t \mid x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \delta_{x_i}((-\infty, t])/n \tag{2}$$

7 Ferguson, T. S., “A Bayesian Analysis of Some Nonparametric Problems”, Annals of Statistics, Vol. 1,
No. 2, 209-230, 1973.

8 Vangel, M. G., “Lower Tolerance Limits for Log-Convex Distributions”, to be published.

9 Metallic Materials and Elements for Aerospace Vehicle Structures, MIL-HDBK-5C, 15 September 1976,
pp. 9-14.
where $n$ is the sample size and the sum over $i$ of $\delta_{x_i}((-\infty, t])$ is equal to the
number of $x_i$ values less than or equal to $t$. For example,

if t = 1, 2, 3, 4, 5
and x = 6, 7, 8, 9, 10
then $F_n(5 \mid 6, 7, 8, 9, 10) = 0$.

If t = 11, 12, 13, 14, 15
then $F_n(11 \mid 6, 7, 8, 9, 10) = 1$.

The posterior distribution for $\{x_i\}_1^n$ is then written as

$$\hat{F}_n(t \mid x_1, x_2, \ldots, x_n) = P_n F_0(t) + (1 - P_n) F_n(t \mid x_1, x_2, \ldots, x_n), \tag{3}$$

where

$$P_n = \frac{\alpha(R)}{\alpha(R) + n}. \tag{4}$$

An example of a Bayes estimate for $t = 1$ when

t = 1, 2, 3, 4, 5
and x = 1, 2, 3, 4, 5 is

$$\hat{F}_n = P_n F_0 + (1 - P_n) F_n = (.5)(.2) + (.5)(.2) = .2.$$
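The mixture in Equations 3 and 4 can be sketched in a few lines of Python. This is an illustrative sketch only; the function names are hypothetical, and both CDFs are taken as simple empirical proportions of values less than or equal to t, as in the worked example above.

```python
def empirical_cdf(data, t):
    """Proportion of values in data that are less than or equal to t."""
    return sum(1 for value in data if value <= t) / len(data)


def posterior_cdf(prior, sample, t):
    """Equation 3: weighted mixture of the prior CDF and the sample CDF."""
    p_n = len(prior) / (len(prior) + len(sample))   # Equation 4
    return p_n * empirical_cdf(prior, t) + (1.0 - p_n) * empirical_cdf(sample, t)


prior = [1, 2, 3, 4, 5]    # larger (prior) data set t
sample = [1, 2, 3, 4, 5]   # smaller (empirical) data set x
bayes_at_1 = posterior_cdf(prior, sample, 1)
print(bayes_at_1)
```

With equal sample sizes the weight is .5, so the estimate at t = 1 reproduces the value .2 obtained in the text.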


3  Nonparametric  Tolerance  Limit  on  the  Bayesian  Quantile 
Estimate 


The allowable value as described previously is a tolerance limit on the quantile
estimates. The process for obtaining that limit is shown in this section. Initially, a
random sample $F(Y)$ of size $M = \alpha(R) + n$ is assumed independent of the mixture
of the prior and empirical data sets shown in Equation 3. By ordering a sample of
$Y_i$ values, the probability density function for $Y_i$, $1 \le i \le M$, can be written
as a Beta distribution,

$$f_{i,M}(z) = \frac{\Gamma(M)\,z^{uM-1}(1-z)^{(1-u)M-1}}{\Gamma(uM)\,\Gamma((1-u)M)}, \tag{5}$$

where $z = F(Y_i)$, with $u$ representing the CDF value corresponding to
the $i$th ordered number. The tolerance limit $Y_m$ for $Y_q$ is

$$P(Y_q > Y_m) = 1 - \alpha = P[F(Y_q) > F(Y_m)] \tag{6}$$

where $Y_q$ is the 100$q$th percentile of $Y$. Since

$$P(Y_q > Y_m) = \int_0^{q} \frac{\Gamma(M)\,z^{uM-1}(1-z)^{(1-u)M-1}}{\Gamma(uM)\,\Gamma((1-u)M)}\,dz \tag{7}$$

from Equation 5, a $1 - \alpha$ tolerance limit on $Y_q$ can be obtained by solving for $u$ from
the following. In the case of the B allowable computation, $\alpha = .05$ and $q = .10$.
Equation 7 can be written as

$$\int_0^{.10} \frac{\Gamma(M)\,z^{uM-1}(1-z)^{(1-u)M-1}}{\Gamma(uM)\,\Gamma((1-u)M)}\,dz = .95. \tag{8}$$
See Table I for tabulation of $u$ and $M$ values that satisfy Equation 8.

Solving for $u$ in Equation 8 determines the lower tolerance limit of the CDF of
sample size $M$ where the $i$th ordered value is equal to $uM$. Obtaining a lower ordered
CDF value from Equation 3 that is approximately equal to $u$ determines the $1 - \alpha$
tolerance limit of the $q$th quantile of the posterior CDF for a sample size $M$.

An example of this would be if there were only prior data and a B allowable
value is required where

t = 5, 6, 7, 8, 12, 16, 20, 25, ..., 40 and
$F_0(t)$ = .033, .066, .099, ..., 1.0,

then $M = 30$ and $u = .034$ from Table I. The allowable value $t_1$ is determined from the
approximate solution of $u \approx F(t)$ resulting in $t_1 = 5$; therefore, the first ordered value
of the prior represents the B allowable value, which is the same as the nonparametric
quantile sign test,10 result, when the sample size is 30.
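Equation 8 has no closed-form solution for u, but it can be solved numerically. The sketch below is one possible approach in Python and is not the procedure used to build Table I: it evaluates the regularized incomplete beta function with a standard continued-fraction expansion and finds u by bisection. All function names are hypothetical.

```python
import math


def _guard(value, tiny=1e-300):
    """Keep continued-fraction terms away from exact zero."""
    return value if abs(value) >= tiny else tiny


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def solve_u(m_total, q=0.10, confidence=0.95, tol=1e-10):
    """Bisection for u in Equation 8; the left side decreases as u grows."""
    lo, hi = 1e-9, q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reg_inc_beta(mid * m_total, (1.0 - mid) * m_total, q) > confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


u_30 = solve_u(30)
print(round(u_30, 3))
```

For M = 30 this sketch should return a value that rounds to the u = .034 quoted from Table I in the example above.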


4  The  Nonparametric  Modified  Hanson-Koopmans  (MHK) 
Procedure 

A  nonparametric  procedure  (MHK),8  for  estimating  the  allowable  value  is  introduced 
for  any  sample  size  greater  than  or  equal  to  2.  The  method  is  a  modification  of 
Hanson-Koopmans,11  process.  The  modification  has  reduced  the  conservatism  in 
computing  property  values  when  compared  with  the  original  method. 

The method involves the following. Let $x_1, \ldots, x_n$ be the order statistics of an
independent  and  identically  distributed  sample  from  a  continuous  distribution  F. 
Assume  that  F  is  log-convex,  that  is  -log  F(x)  is  a  convex  function.  The  class  of  log- 
convex  functions  includes  a  large  enough  group  of  distributions  so  that  the  following 
procedure  involving  log-convex  functions  can  be  considered  nonparametric  for  most 
purposes. 

The  Hanson-Koopmans  lower  tolerance  limits  are  of  the  form 

$$T_{rs} = k x_r + (1 - k) x_s, \tag{9}$$

where $r < s$ and $k > 1$. The tolerance limit $T_{rs}$ can be negative, even if the distribution
$F$ is zero for any negative values. A practical solution to this problem is to apply the
Hanson-Koopmans approach to the log of the data $x$, that is,

$$V_{rs} = k \log x_r + (1 - k) \log x_s \tag{10}$$

and then obtain by exponentiation the following

$$T^*_{rs} = e^{k \log x_r + (1 - k) \log x_s} = x_s \left(\frac{x_r}{x_s}\right)^{k}. \tag{11}$$

10 Conover, W. J., “Practical Nonparametric Statistics”, John Wiley and Sons, 1980, p. 111.

11 Hanson, D. L. and Koopmans, L. H., “Tolerance Limits for the Class of Distributions with Increasing
Hazard Rates”, Annals of Mathematical Statistics, Vol. 35, 1964.
For most distributions of interest, $T^*_{rs}$ still provides conservative tolerance limits
although technically $T^*_{rs}$ is valid for a class of distributions smaller than the log-convex
class corresponding to $T_{rs}$.


In  order  to  determine  the  B  allowable  value,  the  r,  s,  and  k  values  are  obtained 
for  a  given  n  in  Table  II.  Tables  are  also  available  for  the  A  allowable  in  Reference  8. 
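The two forms of the limit in Equations 9 and 11 can be compared with a short Python sketch. The values of r, s, and k below are purely hypothetical placeholders chosen for illustration; actual values must be read from Table II for the sample size at hand. The function names are likewise hypothetical.

```python
import math


def hk_limit(ordered, r, s, k):
    """Equation 9: k*x_r + (1 - k)*x_s, with 1-based ranks r < s."""
    x_r, x_s = ordered[r - 1], ordered[s - 1]
    return k * x_r + (1.0 - k) * x_s


def hk_limit_log(ordered, r, s, k):
    """Equations 10 and 11: the same limit computed on the log scale."""
    x_r, x_s = ordered[r - 1], ordered[s - 1]
    return math.exp(k * math.log(x_r) + (1.0 - k) * math.log(x_s))


strengths = sorted([100.0, 80.0, 90.0])   # made-up strength data
r, s, k = 1, 3, 1.5                        # hypothetical, not from Table II
t_linear = hk_limit(strengths, r, s, k)
t_log = hk_limit_log(strengths, r, s, k)
print(t_linear, t_log)
```

The log-scale form equals x_s (x_r / x_s)^k and is always positive for positive data, which is the reason given in the text for preferring it when the linear form could go negative.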


5  Allowable  Computation  for  Normal  and  Weibull  Models 

The  following  small,  single  sample,  data  set  allowable  computation  procedures  were 
included  for  comparison  purposes.  This  comparison  is  made  with  respect  to  the 
results  obtained  from  the  other  methods  described  in  this  paper. 


The normal PDF is

$$f_N(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/2\sigma^2} \tag{12}$$

where $\mu$ and $\sigma$ are the mean and standard deviation. The normal allowable is

$$A_N = \bar{X} - K_A s, \tag{13}$$

where $K_A$ is a factor obtained from Reference 1 and $\bar{X}$ and $s$ are the sample mean
and standard deviation.

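The Weibull allowable computation is as follows. The Weibull PDF is

$$f_W(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}} \tag{14}$$

where $\beta$ and $\alpha$ are the shape and scale parameters and the Weibull allowables can be
written as

$$A_W = \hat{\alpha}\,[-\log(P_\gamma)]^{1/\hat{\beta}}, \tag{15}$$

where the $P_\gamma$ are tabulated in Reference 3 with $\hat{\alpha}$ and $\hat{\beta}$ being the maximum
likelihood estimates for $\alpha$ and $\beta$ obtained from an algorithm also shown in Reference 3.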

6  The  Reduced  Ratio  Method  (RRM) 

The Reduced Ratio Method,9 determines an allowable value for a smaller data set
$\{S_i\}_1^n$ by introducing an indirect computation procedure involving a larger, previously
obtained set of data, $\{L_j\}_1^m$.

The first step is to determine the mean of L, that is $\bar{L} = \frac{1}{m}\sum_{j=1}^{m} L_j$. The second
step involves obtaining the ratios $R_i = S_i/\bar{L}$, $i = 1, \ldots, n$, and the mean
($\bar{R}$) of the $R_i$'s. The reduced mean, $R_m$, is then obtained from

$$R_m = \bar{R} - t(.95)\,V_R/\sqrt{n}, \tag{16}$$

where $t(.95)$ is the .95 quantile of the t distribution for $n - 1$ degrees of freedom and
$V_R$ is the standard deviation of the $R_i$'s. The next step is to compute an allowable
($L_B$) from the L sample using some single sample procedure such as described in the
previous section. After obtaining $L_B$ the allowable $S_B$ for S is determined as follows

$$S_B = L_B R_m. \tag{17}$$
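The steps of the Reduced Ratio Method can be sketched in Python as follows. This is a hedged illustration, not the MIL-HDBK-5 implementation: the function name and the data are hypothetical, the ratios are taken as each small-sample value divided by the mean of the larger sample, and both the t quantile and the large-sample allowable are passed in as inputs because they come from tables or from a separate single-sample procedure.

```python
import math
import statistics


def reduced_ratio_allowable(small, large, large_allowable, t_quantile):
    """Return (reduced mean ratio, allowable for the small sample)."""
    large_mean = statistics.mean(large)
    ratios = [value / large_mean for value in small]
    ratio_mean = statistics.mean(ratios)
    ratio_sd = statistics.stdev(ratios)                 # n - 1 divisor
    reduced = ratio_mean - t_quantile * ratio_sd / math.sqrt(len(small))  # Eq. 16
    return reduced, large_allowable * reduced           # Eq. 17


small = [90.0, 100.0, 110.0]       # made-up small sample S
large = [95.0, 100.0, 105.0]       # made-up larger sample L (mean 100)
T95_2DF = 2.920                    # .95 quantile of t with 2 degrees of freedom
L_B = 80.0                         # hypothetical allowable already found for L
r_m, s_b = reduced_ratio_allowable(small, large, L_B, T95_2DF)
print(round(r_m, 4), round(s_b, 3))
```

The reduction term shrinks as n grows, so a larger small sample pulls the reduced mean ratio closer to the plain mean ratio.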


7  The  Pooling  Process 

The  pooling  process,  as  previously  mentioned,  requires  combining  a  smaller  data  set 
S  (the  one  represented  by  the  allowable)  with  a  larger  set  L  obtained  from  prior  test 
results.  In  the  MHK  process  the  objective  is  to  represent  S  with  a  combined  data 
set of S and L with sample size $m = n_S + n_L$. In the Bayes method the prior is
represented  by  L  and  the  empirical  data  by  S. 

If  both  the  means  and  variances  of  S  and  L  are  known  to  be  equal,  then  the  pooling 
process  can  be  easily  justified.  Unfortunately,  this  is  seldom  the  case.  Therefore,  the 
following transformation is suggested. Let $L_i$ and $S_i$ be the data from sets L and S
respectively and define the new data sets S* and L* by

$$S_i^* = (S_i - \bar{S})/\bar{S} \tag{18}$$

and

$$L_i^* = (L_i - \bar{L})/\bar{L} \tag{19}$$

where $\bar{L}$ and $\bar{S}$ are the data set means. This procedure involves reducing the mean of
S and L to a common mean of zero for S* and L*. In addition, the transformed data
sets, S* and L*, have standard deviations equal to the CV's of S and L. Schematics
of this transformation are shown in Appendices A and B.
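The transformation described above is easy to verify numerically. The Python sketch below, with a hypothetical function name and made-up data, centers each data set on its own mean and scales by that mean, then checks the two properties stated in the text: a zero mean and a standard deviation equal to the original coefficient of variation.

```python
import statistics


def to_relative_deviations(data):
    """Subtract the data set mean and divide by it."""
    center = statistics.mean(data)
    return [(value - center) / center for value in data]


small = [90.0, 100.0, 110.0]                    # made-up set S
large = [180.0, 190.0, 200.0, 210.0, 220.0]     # made-up set L
small_star = to_relative_deviations(small)
large_star = to_relative_deviations(large)
pooled = sorted(small_star + large_star)        # combined ordered array

cv_small = statistics.stdev(small) / statistics.mean(small)
cv_large = statistics.stdev(large) / statistics.mean(large)
print(statistics.stdev(small_star), cv_small)
print(statistics.stdev(large_star), cv_large)
```

Because both transformed sets are unitless and centered at zero, they can be pooled even though the original means differ by a factor of two in this example.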

It  is  suggested  that  an  equality  of  variance  test  between  S*  and  L*  be  made  in 
order  to  determine  if  an  excessively  large  difference  in  variance  exists.  The  Siegel- 
Tukey  nonparametric  rank  sum  method,12  proved  effective  in  testing  for  equality  of 
variance  although  for  small  samples  (less  than  ten),  the  test  on  equality  of  variance 
will  result  in  a  certain  amount  of  uncertainty. 
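
The equality of variance check described above can be sketched as follows. This is only an illustrative implementation of the Siegel-Tukey rank assignment followed by a rank-sum statistic; the function names are hypothetical, ties are not handled, and the normal approximation used for the p-value is an assumption that is crude for the small samples discussed here.

```python
import math

def siegel_tukey_ranks(n):
    """Siegel-Tukey ranks for n sorted positions: rank 1 goes to the smallest
    value, ranks 2-3 to the two largest, ranks 4-5 to the next two smallest,
    and so on, alternating between the two ends."""
    ranks = [0] * n
    lo, hi = 0, n - 1
    ranks[lo] = 1
    lo += 1
    rank, from_top = 2, True
    while lo <= hi:
        for _ in range(2):              # two ranks per side before switching
            if lo > hi:
                break
            if from_top:
                ranks[hi] = rank
                hi -= 1
            else:
                ranks[lo] = rank
                lo += 1
            rank += 1
        from_top = not from_top
    return ranks

def siegel_tukey_test(s_star, l_star):
    """Return (rank sum of s_star, z, two-sided p) for the pooled samples.
    A low rank sum suggests s_star has the larger spread."""
    pooled = sorted([(v, 0) for v in s_star] + [(v, 1) for v in l_star])
    ranks = siegel_tukey_ranks(len(pooled))
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_s, n_l = len(s_star), len(l_star)
    n = n_s + n_l
    mean_w = n_s * (n + 1) / 2.0            # rank-sum mean under equal spread
    var_w = n_s * n_l * (n + 1) / 12.0      # rank-sum variance (no ties)
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal approximation
    return w, z, p

# Hypothetical transformed data (both sets already centered at zero).
w, z, p = siegel_tukey_test([-3.0, 3.0], [-1.0, 0.0, 0.5, 1.0])
```

Because S* and L* share a common mean of zero after the transformation, the equal-location assumption behind the Siegel-Tukey ranking is reasonable for this use.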

8  Allowable  Values  for  S*  from  Pooled  Data 

8.1  Bayes  Solution 

In the Bayes application let the smaller sample x (newly obtained data) of size n_S
be represented by the S* values and the larger sample t (the prior) with n_L values by
L*. Initially, u in Equation 8 is obtained from Table I for M equal to the combined
sample sizes of S* and L* in order to determine the allowable for S*. CDF values are
determined from Equation 3 where t = L* and x_i = S*_i, i = 1,2,...,n. Equating the CDF
value that corresponds to u determines the ordered (uM) value of F_n. Inverting F_n so
that the corresponding ordered test result is obtained then determines the allowable
value S*_B.

12 Siegel, S. and Tukey, J. W., "A Nonparametric Sum of Ranks Procedure for Relative Spread in Unpaired
Samples", Journal of the American Statistical Association, September, 1960.

8.2  The  Modified  Hanson-Koopmans  Method 

The nonparametric,8 solution for obtaining allowable values involves pooling the values
from S* and L* and letting the combined ordered array of values be x in Equation
11 with sample size n = n_S + n_L. Let the resulting value be denoted S*_B (in place of
the symbol used in Equation 11). This
method is very simple to apply yet provides results for any sample size greater than

9  Transformation  Procedure  in  Determining  Allowable 

The allowable value for S* is not sufficient since S and L were the original data sets
involved in the analysis and their magnitudes differ from S* and L*. Therefore, the
following transformation is required:

$$ S_B = S_B^{*}\,\bar{S}_{.95} + \bar{S}_{.95} \qquad (20) $$

where S_B is the required allowable value for the small sample S. The $\bar{S}_{.95}$ value
represents the lower 95% confidence value for the mean of the S values. The purpose
in using $\bar{S}_{.95}$ instead of $\bar{S}$ is to adjust for the variability in estimating the
mean $\bar{S}$ of the small sample S. This variability in $\bar{S}$ directly affects the computation
of the allowable S_B. Without the adjustment, it often results in S_B values being greater
than the p-th percentile of the population of S more than 5% of the time, which is counter
to the requirement for an allowable value as described in the introduction.
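
The pooling transformation of Section 7 and the back-transformation above can be sketched in a few lines. This is a minimal illustration, not the authors' program: the function names and the sample values are hypothetical, and the lower confidence bound on the mean is assumed to be the usual one-sided t bound, mean - t * s / sqrt(n), with the t quantile supplied by the caller.

```python
import math
import statistics

def to_relative(data):
    """Transform x_i -> (x_i - mean) / mean, giving a zero mean and a
    standard deviation equal to the CV of the original data."""
    m = statistics.mean(data)
    return [(x - m) / m for x in data]

def lower_mean_bound(data, t_quantile):
    """Assumed form of the lower confidence value for the mean:
    mean - t * s / sqrt(n)."""
    n = len(data)
    return statistics.mean(data) - t_quantile * statistics.stdev(data) / math.sqrt(n)

def back_transform(s_star_b, s_data, t_quantile):
    """Return S_B = S*_B * Sbar_.95 + Sbar_.95 for the small sample."""
    s95 = lower_mean_bound(s_data, t_quantile)
    return s_star_b * s95 + s95

# Hypothetical small sample of six strength values.
s = [48.0, 52.0, 45.0, 55.0, 50.0, 50.0]
s_star = to_relative(s)

# 2.015 is the .95 quantile of the t distribution with 5 degrees of freedom.
# The value -0.2 stands in for an allowable S*_B obtained from the pooled data.
s_b = back_transform(-0.2, s, 2.015)
```

Since the allowable for the transformed data is negative for a lower percentile, the result is a fraction of the lower confidence value for the mean rather than of the sample mean itself.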


10  Results  and  Discussions 

10.1  Coverage  Rates  from  MHK,  Bayes,  and  RRM 

In Tables III and IV coverage rate results are tabulated from the application
of the MHK, Bayes, and RRM procedures as functions of the coefficients of variation
(CV(i)) for both the small sample S and the large sample L. The coverage rate
represents the percent of computed allowable values less than the 10% pt. (B allowable)
or the 1% pt. (A allowable) of a population of values representing the data set. The
data was obtained by randomly selecting values from either a normal or Weibull
distribution with the specified CV's.
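
The coverage rate defined above can be estimated by simulation along the following lines. This sketch is not the simulation used for the tables: it applies a single sample normal B allowable to six values, and the tolerance factor 3.006 (n = 6, 90% of the population, 95% confidence), the population mean, the trial count, and all function names are assumptions made for illustration.

```python
import random
import statistics

def coverage_rate(allowable_fn, draw_sample, true_percentile, trials=2000, seed=1):
    """Percent of simulated allowables that fall below the true percentile."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if allowable_fn(draw_sample(rng)) < true_percentile:
            hits += 1
    return 100.0 * hits / trials

K_B_N6 = 3.006  # assumed one-sided normal tolerance factor for n = 6

def normal_b_allowable(sample):
    """Single sample normal B allowable: mean - k * s."""
    return statistics.mean(sample) - K_B_N6 * statistics.stdev(sample)

mean, cv = 50.0, 0.10
sd = cv * mean
true_p10 = mean - 1.2816 * sd   # 10% point of the normal population

rate = coverage_rate(
    normal_b_allowable,
    lambda rng: [rng.gauss(mean, sd) for _ in range(6)],
    true_p10,
)
```

For a correctly specified model the estimated rate should be close to the 95% requirement; substituting a pooled two-sample procedure for `normal_b_allowable` would give the kind of comparison reported in the tables.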

The mean and standard deviation are identified as m(1) and s(1) for the larger
sample L and m(2) and s(2) for the smaller sample S. The sample sizes are usually
n(1) = 30 and n(2) = 6 for the large and small data sets respectively. CV(1) and
CV(2) have similar representation for the two samples.

In Table III results from randomly selected values obtained from normal distributions
with sample sizes of 30 and 6 show that for differences in the CV's less than
20% an acceptable coverage rate can be obtained from all methods since the rates
are  greater  than  95%.  The  MHK  and  Bayes  methods  provide  acceptable  results  even 
for  a  60%  difference  in  the  CV  values  although  they  fail  to  obtain  the  desired  95% 
minimum.  The  RRM  coverage  results  with  40%  differences  in  CV’s  fail  to  provide 
acceptable  coverage  as  shown  in  both  the  A  and  B  allowable  computation.  The  A 
allowables  could  not  be  computed  using  the  Bayes  method  since  an  amount  of  data 
much  greater  than  36  would  be  required.  The  A  allowable  tables  for  u  and  M  have 
not  been  computed  because  of  the  excessively  large  data  set  requirements.  When 
CV(1)  =  .12  and  CV(2)  =  .10,  greater  variability  in  L  than  S,  the  coverage  is  much 
greater  than  required,  therefore,  resulting  in  potentially  over  conservative  estimates 
for  the  S  allowable.  This  will  usually  be  the  case  when  CV(1)  >  CV(2). 

The  MHK  and  Bayes  methods’  ability  to  provide  acceptable  coverage  when  the 
CV's are .16 for the small sample and .10 for the large sample shows that the methods
are  quite  robust  with  respect  to  differences  in  the  spread  of  the  data  sets.  In  actual 
engineering  application  it  is  unlikely  that  the  material  being  considered  in  the  design 
(small  sample)  would  have  a  variability  60%  greater  than  that  of  the  previously  tested, 
similar  material  (large  sample). 

In Table IV, the small sample data set was randomly selected from a Weibull distribution
where the shape and scale values were computed so that they were equivalent
to the tabulated mean and CV's. The larger data set was obtained from a normal
distribution. The results are similar to those in Table III for the MHK and Bayes
methods. The Table IV RRM results show a reduction in the coverage when compared
with those in Table III; an example is the 78.8% coverage for the A allowable
in Table III compared to 48% in Table IV for differences in the CV's of only 40%.
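
One way to compute Weibull shape and scale values that match a prescribed mean and CV is sketched below. The paper does not state how its values were obtained, so the bisection search, its bounds, and the function names are assumptions. The sketch uses the standard two-parameter Weibull relations: mean = scale * Gamma(1 + 1/shape) and CV^2 = Gamma(1 + 2/shape) / Gamma(1 + 1/shape)^2 - 1.

```python
import math

def weibull_cv(shape):
    """Coefficient of variation of a two-parameter Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / (g1 * g1) - 1.0)

def weibull_from_mean_cv(mean, cv, lo=0.5, hi=200.0, tol=1e-10):
    """Find (shape, scale) giving the requested mean and CV.
    The CV decreases as the shape increases, so bisection applies."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv(mid) > cv:
            lo = mid        # CV too large: the shape must increase
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale

# Small sample population of Table IV with m(2) = 50 and CV(2) = .10.
shape, scale = weibull_from_mean_cv(50.0, 0.10)
```

Values drawn with `random.weibullvariate(scale, shape)` would then have approximately the requested mean and CV.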

These results indicate that the RRM is sensitive to the statistical model assumption
in representing the test data while the MHK and Bayes methods are much less
sensitive. Since MHK and Bayes are nonparametric methods, this robustness to the
model assumption could be expected.

In  Table  V  data  was  obtained  from  normal  distributions  with  CV’s  of  .10  and 
.16  for  L  and  S  respectively.  The  coverage  percent  and  range  of  allowable  values  are 
tabulated  with  respect  to  increasing  sample  sizes  of  both  L  and  S  for  the  RRM  and 
MHK  procedures.  Results  show  that  increasing  sample  size  for  L  with  constant  small 
sample  size  for  S  of  6  causes  the  RRM  process  to  perform  poorly  since  the  coverage 
is reduced from 86.6% to 73%. The only advantage is the reduction in the range
of the allowable from 17 to 14 which is not very significant. Increasing the sample
size of S from 6 to 15 also shows a somewhat unsatisfactory result since an 81% to
72.8% reduction in the coverage occurs. These coverage reductions are the inherent
weakness in the method which is vulnerable to situations where L has a much smaller
CV than S. The range reduction from 15 to 10 could be considered an improvement
since there is less spread in the allowable estimate. Unfortunately, this advantage is
removed because of the coverage loss. This implies that many more (much greater
than 5%) allowable values will be greater than the 10% pt. of the population of
material strength measurements. This situation could result in an overly optimistic
allowable value and therefore a potential under-design situation.

MHK  results  provide  reasonably  acceptable  coverage  for  all  the  combinations  of 
sample  size  for  both  L  and  S.  That  is,  results  show,  at  least  for  the  cases  considered, 
that  the  method  is  robust  to  a  variety  of  sample  sizes  for  both  L  and  S.  The  range  of 
the allowables is affected by the sample sizes, particularly for the case MHK(15,6) vs.
MHK(30,15). The MHK method can provide a smaller range on the allowable but
will not make significant improvements on the coverage capability when the sample
sizes are increased. In the results for MHK(60,6) and MHK(15,6) the coverage is 88.4%
and 92.8% respectively, showing that increasing the sample size of L can reduce the coverage. This
is  the  result  of  sample  L’s  increased  influence  in  the  allowable  computation  which  the 
analyst  should  be  aware  of  when  applying  the  MHK  method.  It  is  suggested  that  the 
ratio  of  sample  sizes  n(2)/n(l)  should  not  be  any  smaller  than  .2. 

10.2  A  Comparison  Study:  Single  Sample  Vs.  Two-sample  Allowable 
Computation 

In Figures 2 through 5 a comparison is made between the multi-sample methods
(MHK,  RRM,  and  Bayes)  and  the  single  sample  Weibull  and  normal  methods  with 
respect  to  the  coverage  percentage  and  the  spread  in  the  allowable  estimates.  In 
Figures  2  and  3  results  were  obtained  by  using  a  random  selection  of  data  from  normal 
distributions.  The  N(6)  and  W(6)  designations  represent  results  from  applying  6 
data  values  to  the  Normal  and  Weibull  allowable  computation  procedures.  MHK(36) 
results  are  for  the  Modified  Hanson-Koopmans  method  using  a  single  sample  with  36 
data  values  from  the  S  population  distribution.  CV’s  of  .10  and  .14  are  introduced  for 
L  and  S  in  order  to  represent  a  possible  difference  in  the  spread  of  the  two  data  sets. 
The ordinate values (A) shown in the figure represent the 95th percentile value of
the  allowable  simulation  results.  Ideally,  the  values  should  be  located  on  the  dotted 
line  for  optimum  coverage.  Values  above  the  line  indicate  that  coverage  has  not 
been  achieved.  Those  below  the  line  provide  the  coverage.  This  can  also  identify  an 
excessively  low  allowable  value.  In  the  second  part  of  the  figure  the  vertical  dotted 
lines  represent  the  spread  in  the  allowable  estimates  (1  to  99  percent  of  all  the  data 
from  the  simulation  results). 

The  Figure  2  results  show  that  the  MHK  and  Bayes  methods  can  provide  an 
almost  optimum  computed  B  allowable.  The  RRM  approach  fails  to  provide  the 
coverage  since  results  show  an  87%  rate.  Normal  distribution  for  single  sample  (S) 
of  6  provided  reasonably  good  coverage  as  expected  since  the  data  was  originally 
obtained  from  a  normal  model.  The  Weibull  results  were  overly  conservative,  possibly, 
because  an  incorrect  model  was  assumed  for  the  data  (normal).  MHK(36)  results  were 
excellent  as  expected  since  the  36  values  applied  to  the  model  were  all  from  the  normal 
distribution  representing  the  data  sample  S. 
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Evaluation  of  the  models’  capabilities  with  respect  to  spread  in  the  allowables 
showed  the  two-sample  methods’  allowable  values  to  have  much  less  variability  than 
those  of  the  single  data  set  methods. 

The  results  in  Figure  3  are  similar  to  those  in  Figure  2  except  that  the  A  allowable 
was  computed.  The  Bayes  method  was  omitted  since  a  very  large  data  set  would 
have  been  required.  A  spread  in  excess  of  50  was  determined  from  applying  the 
single  sample  normal  analysis  with  the  1%  point  showing  an  allowable  of  -12.  This 
result  can  discourage  the  engineers  from  using  statistical  procedures  for  obtaining 
design  allowables.  In  this  case,  the  single  sample  method,  although  statistically  correct, 
provides  a  design  number  that  is  incorrect  from  an  engineering  perspective.  This  result 
has  been  the  primary  motivating  factor  in  the  authors’  examination  of  alternate  small 
sample  procedures.  The  results  from  MHK  and  RRM  show  a  more  reliable  range  of 
values  for  the  allowable. 

In Figures 4 and 5 random samples were obtained from a NASA contractor
report,13 on composite material strength measurements. The figures identify the
names of the companies that manufactured the material and the number of specimens
tested. In Figure 4, the CV's of .10 and .13 were obtained from unidirectional
tension and crossply tension data. The results show that the MHK and Bayes methods
are effective in obtaining a desirable allowable estimate. The RRM results are
greater than the 10% point and therefore fail to provide an acceptable allowable
estimate. The normal and Weibull perform well in obtaining the proper coverage but
as shown previously the spread in allowables for N(6) and W(6) is much greater than
that of the MHK, Bayes, and RRM results.

In  Figure  5,  the  random  samples  for  both  S  and  L  were  obtained  from  230  data 
values  (composite  short  beam  shear  test).  The  results  are  similar  to  those  in  Figure 
4  except  that  the  normal  analysis,  N(6),  fails  to  provide  acceptable  coverage  and  the 
MHK  and  Bayes  allowables  are  more  conservative  (excessive  coverage).  A  relatively 
good agreement between the coverages can be identified by comparing MHK(36) and
MHK  results.  A  reasonable  correlation  also  exists  for  the  spread  in  the  allowable 
estimates.  This  implies  that  MHK  can  perform  almost  as  well  as  if  36  values  from  S 
were  applied  to  the  MHK  analysis  instead  of  only  6  from  S  and  30  from  L. 

13 Reese, C. and Sorem, J. Jr., "Statistical Distribution of Mechanical Properties for Three Graphite-Epoxy
Material Systems", NASA Contract Report No. 165736, 1981.

11 Conclusions

Results from this comparison study show that the nonparametric MHK method is
superior in determining small sample design allowables when compared to the
results from the other procedures evaluated in this paper. The allowable values
obtained from the MHK method application consistently meet the coverage requirement
(95% of values less than a specified percentile of the population of all test data) for
a relatively wide spectrum of data sets. The variability of the MHK values is much
lower than that of the values resulting from the small, single sample normal or Weibull
analysis.

The nonparametric Bayesian method provides acceptable allowable values although
this method is limited by the sample size requirements. This limitation prevents
the method from being as desirable as the MHK process. Another undesirable
feature is the complexity involved in applying the method.

The Reduced Ratio Method, which is currently used by the aircraft industry,
is not effective due to its failure in providing the required coverage when there are
relatively small differences between the CV's of the prior large data set and the smaller
empirical set from which the allowable is obtained. Also, increasing the sample size of
the empirical data and incorrectly assuming statistical models for the data sets prevent
proper coverage.

Application  of  the  small,  single  sample  analysis  (Normal  and  Weibull)  results  in 
extremely  large  variability  in  the  allowable  estimate.  In  addition,  the  methods  fail  to 
provide  acceptable  coverage  when  incorrect  models  are  assumed. 

The  proposed  pooling  process  introduced  in  this  paper  provides  a  desirable  method 
for  combining  the  small  and  large  data  sets  when  there  is  a  difference  in  their  mean 
values.  Application  of  this  process  in  the  MHK  and  Bayesian  analysis  results  in  an 
effective  solution  in  obtaining  economical  allowable  values. 
Figure 1A. Tolerance Limit for N(0,1) Population

[Figure panels: Allowable Values / Coverage Evaluation; The Range of the 'B' Values]

Figure 4. Allowables Versus Coverage/Range of Allowables


Table I. M and u Values for Bayesian Basis Value Computation

  M      u            M      u
  1   0.021953       51   0.044804
  2   0.017855       52   0.045192
  3   0.016529       53   0.045565
  4   0.016140       54   0.045937
  5   0.016199       55   0.046301
  6   0.016516       56   0.046648
  7   0.016997       57   0.046996
  8   0.017590       58   0.047339
  9   0.018264       59   0.047673
 10   0.018996       60   0.048011
 11   0.019769       61   0.048318
 12   0.020570       62   0.048642
 13   0.021391       63   0.048945
 14   0.022223       64   0.049255
 15   0.023060       65   0.049563
 16   0.023897       66   0.049848
 17   0.024729       67   0.050144
 18   0.025554       68   0.050421
 19   0.026368       69   0.050695
 20   0.027171       70   0.050968
 21   0.027959       71   0.051238
 22   0.028734       72   0.051506
 23   0.029491       73   0.051771
 24   0.030233       74   0.052034
 25   0.030959       75   0.052284
 26   0.031666       76   0.052530
 27   0.032361       77   0.052773
 28   0.033033       78   0.053017
 29   0.033695       79   0.053244
 30   0.034339       80   0.053479
 31   0.034967       81   0.053702
 32   0.035577       82   0.053932
 33   0.036172       83   0.054160
 34   0.036754       84   0.054375
 35   0.037328       85   0.054600
 36   0.037884       86   0.054808
 37   0.038420       87   0.055017
 38   0.038952       88   0.055221
 39   0.039461       89   0.055435
 40   0.039964       90   0.055634
 41   0.040459       91   0.055831
 42   0.040944       92   0.056024
 43   0.041409       93   0.056215
 44   0.041864       94   0.056417
 45   0.042314       95   0.056599
 46   0.042751       96   0.056781
 47   0.043182       97   0.056960
 48   0.043596       98   0.057153
 49   0.044009       99   0.057332
 50   0.044413      100   0.057502

  M      u            M      u
101   0.057686      151   0.064302
102   0.057856      152   0.064395
103   0.058023      153   0.064514
104   0.058188      154   0.064609
105   0.058352      155   0.064717
106   0.058517      156   0.064814
107   0.058670      157   0.064912
108   0.058837      158   0.065010
109   0.059006      159   0.065099
110   0.059156      160   0.065193
111   0.059313      161   0.065273
112   0.059454      162   0.065382
113   0.059619      163   0.065462
114   0.059761      164   0.065555
115   0.059914      165   0.065658
116   0.060051      166   0.065734
117   0.060192      167   0.065822
118   0.060344      168   0.065910
119   0.060480      169   0.065996
120   0.060628      170   0.066108
121   0.060754      171   0.066192
122   0.060883      172   0.066277
123   0.061031      173   0.066384
124   0.061162      174   0.066449
125   0.061292      175   0.066530
126   0.061420      176   0.066613
127   0.061547      177   0.066705
128   0.061679      178   0.066789
129   0.061802      179   0.066872
130   0.061933      180   0.066934
131   0.062065      181   0.067007
132   0.062179      182   0.067098
133   0.062293      183   0.067176
134   0.062430      184   0.067258
135   0.062553      185   0.067333
136   0.062667      186   0.067418
137   0.062784      187   0.067486
138   0.062894      188   0.067569
139   0.063010      189   0.067628
140   0.063128      190   0.067720
141   0.063245      191   0.067794
142   0.063344      192   0.067871
143   0.063459      193   0.067952
144   0.063550      194   0.068022
145   0.063666      195   0.068103
146   0.063763      196   0.068178
147   0.063899      197   0.068237
148   0.063985      198   0.068315
149   0.064101      199   0.068388
150   0.064197      200   0.068459
Table II. Modified Hanson-Koopmans Constants for Basis Value

  n    r    s      k
  2    1    2   35.177
  3    1    3    7.859
  4    1    4    4.505
  5    1    4    4.101
  6    1    5    3.064
  7    1    5    2.858
  8    1    6    2.382
  9    1    6    2.253
 10    1    6    2.137
 11    1    7    1.897
 12    1    7    1.814
 13    1    7    1.738
 14    1    8    1.599
 15    1    8    1.540
 16    1    8    1.485
 17    1    8    1.434
 18    1    9    1.354
 19    1    9    1.311
 20    1   10    1.253
 21    1   10    1.218
 22    1   10    1.184
 23    1   11    1.143
 24    1   11    1.114
 25    1   11    1.087
 26    1   11    1.060
 27    1   11    1.035
 28    1   12    1.010
 29    1   --    1
 30    2   12    1.373
 31    2   12    1.344
 32    2   12    1.315
 33    2   13    1.270
 34    2   13    1.245
 35    2   13    1.221
 36    2   13    1.197
 37    2   13    1.174
 38    2   13    1.151
 39    2   13    1.129
 40    2   13    1.108
 41    2   14    1.083
 42    2   14    1.064
 43    2   14    1.045
 44    2   14    1.027
 45    2   14    1.009
 46    2   --    1
Table III. Simulation Results/Computing Allowable Values
Coverage Rate (%) Versus CV Differences Normal - Normal Distributions


CV 

CV(1) 

CV(2) 

.10 

.10 

.10 

.12 

.10 

.14 

.10 

.16 

.10 

.18 

.10 

.20 

.12 

.10  . 

95.6 
(93.8)* 

92.6 


88.2  89.2 

83.2  83.0 

99.8 _ 99.8 

CV(i)  =  s(i)/m(i),i  =  i,2 


86.6 

(84.2)* 

81.0 

72.2 

64.2 
99.6 


95.8  78.8 

(92.8)*  (72.8)* 

89.4  59.4 


83.4 

72.6 

99.8 


41.2 

24.6 

100 


m(1) = 200, m(2) = 50

Assumed distributions are N(m(1),s(1)), N(m(2),s(2)); N = normal distribution

MHK = Modified Hanson-Koopmans
Bayes = Nonparametric Bayes (Ferguson)
RRM = Reduced Ratio Method (Mil-5)

Sample size n(1) = 30 (prior), n(2) = 6 (data) for cases except ( )*.
* sample size n(1) = 60, n(2) = 6


Table IV. Simulation Results/Computing Allowable Values
Coverage Rate (%) Versus CV Differences Normal - Weibull Distributions

      CV                           Coverage Rate (%)
                        'B' Allowables                'A' Allowables
CV(1)   CV(2)      MHK       Bayes      RRM           MHK        RRM
 .10     .10       98.6      99.2       97.8          99.6       88.6
 .10     .12       98.0      98.4       90.8          98.4       68.6
 .10     .14       94.0      94.2       82.2          94.4       48.0
                  (94.2)*    (--)*     (90.6)*       (96.0)*    (63.6)*
 .10     .16       89.0      89.6       73.4          89.2       29.0
 .10     .18       84.8      86.8       65.0          82.0       19.6
 .10     .20       76.0      76.6       57.4          69.0       14.0
 .12     .10       99.8      99.8       99.4          99.6       98.2

CV(i) = s(i)/m(i), i = 1,2
m(1) = 200, m(2) = 50

Distributions N(m(1),s(1)), W(a(2),b(2))
where N and W are Normal and Weibull models
for prior and current data sets respectively

a(2) = shape parameter and b(2) = scale
determined for prescribed CV in columns 1 and 2

* sample size n(1) = 15, n(2) = 6


Table V. Range and Coverage (%) Versus Sample Size/Methods
Normal Distributions

Method             Range (%) of 'B' Allowable       Coverage (%)
(n(1), n(2))         01        50        99         'B' Allowable
RRM (15,6)         27.19     35.55     44.28           86.6
RRM (30,6)         29.10     36.80     44.58           81.0
RRM (60,6)         30.32     37.52     44.70           73.0
RRM (30,15)        33.42     38.23     43.84           72.8
MHK (15,6)         20.49     32.95     42.87           92.8
MHK (30,6)         25.12     34.51     43.25           91.8
MHK (60,6)         27.40     35.19     42.98           88.4
MHK (30,15)        29.06     36.17     42.07           90.0

n(1) = L sample size, CV(1) = .10
n(2) = S sample size, CV(2) = .16
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Appendix  A.  Transformation  Process  for  Pooling  PDF's 
of  Small  Data  Set(S)  and  Large  Data  Set(L) 


Appendix  B.  PDF's  for  Both  Pooled  and  Single  Data  Sheets 


Transformed  Strength  Data 


THE  PRINCIPLES  OF  SKIP-LOT  SAMPLING  AND  A 
COMPARISON  OF  SAMPLING  FREQUENCY  OPTIONS 
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and 
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INTRODUCTION 


1.  The  American,  British,  Canadian  and  Australian 
(ABCA) Armies, plus New Zealand in an observer role,
jointly  participate  in  a  standardization  programme  on 
proofing,  inspection,  and  quality  assurance  (PIQA) .  This 
quadripartite  working  group  (QWG)  is  mandated  to  investigate 
areas  where  opportunities  for  standardization  exist  and 
develop  appropriate  standardized  procedures  for  implementing 
specific  PIQA  techniques  in  the  field. 

2.  A  large  component  of  these  areas  of  investigation  is 
the  application  of  statistical  sampling  methods  for  inspection 
purposes.  Skip-lot  sampling  is  one  statistical  technique 
currently  being  examined  under  project  number  QA/29  "Skip-Lot 
Sampling."  A  quadripartite  advisory  paper  (QAP  28)  has 
recently  been  published  that  explains  the  principles  of  this 
technique [1].

3.  The  purpose  of  the  present  report  is  two-fold.  The 
first  section  briefly  introduces  the  principles  of  skip-lot 
sampling  by  means  of  summarizing  the  contents  of  QAP  28.  This 
will  include  identifying  the  conditions  under  which 
implementing  a  skip-lot  sampling  plan  may  be  warranted  or 
beneficial. 


4. The second part focuses on the comparison of three
different sampling frequency options that could be used during
the skipping phase of any skip-lot sampling plan. The quality
assurance properties of each sampling frequency option are
derived by utilizing the simplified Markov Chain approach
developed by Brugger [2 and 3]. Some advice is provided
regarding which option is most appropriate under economic
constraints or when one wishes to protect against quality deterioration.
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OVERVIEW  OF  SKIP-LOT  SAMPLING1 

5. Skip-lot sampling inspection can be defined as "an
acceptance sampling procedure in which some lots in a series
are accepted without inspection, when the sampling results for
a stated number of immediately preceding lots meet stated
criteria."2 Interpreting this definition in a less formal
sense, a skip-lot sampling plan involves sampling some lots
according to established procedures at a specified sampling
frequency level, while other lots are accepted without
inspection.

6. Herein lies both the purpose of skip-lot sampling
and its disadvantages. There are various ways of reducing the
inspection effort on products where a demonstrated high
quality is being maintained by the supplier. At one extreme,
sampling could be discontinued altogether. This is not
recommended for the obvious reason that a deterioration in
process quality could remain undetected for some time.
Another option is to reduce the sample size of items inspected
within a lot, which is the method recognized in ABCA
Quadripartite Standardization Agreements (QSTAGs) 105 and 330.
Skip-lot sampling offers yet another option which is applied
to the lots as opposed to the individual items. This is not
to say that a skip-lot sampling plan must be applied in place
of reduced inspection when it is deemed more cost-effective.3
Because one is applying these at different stages in the
inspection process it is entirely possible to overlay a
skip-lot sampling plan over a variety of inspection plans that
are applied to the individual items. This is analogous to the
commonly used statistical technique known as two-stage
sampling since the sample drawn takes place in two steps.4
Reference [6] indicates that for two-stage sampling, any type
of sampling can be employed at each step and combined to form
an overall sampling plan.

7. Implementing a skip-lot plan also adds another
degree of risk, since an occasional bad lot might be accepted
without inspection. As will be illustrated later when the
construction of skip-lot plans is discussed, this risk can be
controlled through close monitoring of the production process,
plus the inclusion of intermediate retrial stages.

1 This section draws heavily from the contents of
reference 1.

2 Proposed International Standards Organization (ISO)
definition, reference 4.

3 p. 252 reference 5.

4 p. 274 reference 6.

8.  Skip-lot  sampling  is  beneficial  when  the  lot 
inspection  or  testing  is  destructive  or  costly.  Also,  if 
availability  of  inspection  personnel  or  test  equipment  is 
limited,  skip-lot  sampling  may  be  useful.  There  are  three 
conditions  which  must  be  satisfied  before  skip-lot  sampling  is 
applied  to  a  product: 


a. the product must be grouped into identifiable lots
as it is being produced or presented for
inspection;

b.  production  must  be  stable,  to  assure  a  homogeneous 
series  of  lots;  and 

c.  quality  must  be  high,  as  demonstrated  by  previous 
history  of  the  item. 


9.  One  of  course  should  always  investigate  whether 
there  are  any  incremental  administration  complications  or 
costs  associated  with  incorporating  a  skip-lot  plan,  and 
ensure  that  they  are  less  than  the  benefits  to  be  garnered  by 
skip-lotting. 


CONSTRUCTION  OF  SKIP-LOT  PLANS 

10.  The  most  convenient  way  to  understand  skip-lot 
sampling  is  by  means  of  illustration.  Figure  1  shows  a 
skip-lot  plan  in  its  simplest  form,  a  qualification  stage  and 
one  level  of  sampling.  The  process  begins  at  the 
qualification  phase  and  one  remains  in  this  phase  until  i 
consecutive  lots  have  been  accepted.  When  this  condition  is 
satisfied,  one  switches  to  the  skipping  phase. 

11.  For  this  example,  there  is  only  one  level  of 
sampling  in  the  skipping  phase,  hence  one  either  samples  each 
successive  lot  with  probability  f,  or  the  lot  is  skipped 
(accepted  without  inspection)  with  probability  1-f. 

12.  There  are  some  key  features  in  Figure  1  that  should 
be  part  of  any  skip-lot  plan.  First,  the  process  should 
always  begin  with  100%  inspection  of  lots.  This  ensures  that 
the  product  has  a  suitably  high  quality  history  for 
skip-lotting  to  be  applied.  Next,  the  selection  of  lots  to  be 
sampled  during  the  skip  phase  should  be  done  at  random  with 
probability  f  of  being  selected. 


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13.  Perhaps  most  important,  one  should  determine  how 
responsive  the  process  must  be  to  an  adverse  shift  in  lot 
quality.  A  properly  designed  plan  should  respond  to  an 
adverse  shift  by  resuming  100%  lot  inspection.  The  idea  of 
delaying  the  decision  of  changing  inspection  phases  is 
usually  added  to  a  skip-lot  plan  to  protect  against  a 
premature  action  of  reinstituting  100%  inspection  and 
requalification  of  the  product  when  only  a  single  rejection 
occurs,  as  shown  in  Figure  1. 

14.  The intermediate step acts as a checkpoint to
scrutinize the next few lots (based on experience, a minimum
of four lots is recommended) to determine whether the lot
quality has experienced an abrupt change. If no additional
lots are rejected during this step, the process returns to
the skipping phase. If, however, the process fails to meet
the criterion for returning to the skip phase, 100%
inspection resumes at the qualification phase and the
process of requalifying by accepting i consecutive lots
starts over.
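
The plan described in paragraphs 10 through 14 can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name, the seeded random generator, and the rule that a retrial step ends as soon as a lot is rejected are choices made here, not details taken from Figure 1. The parameters i, f and x follow the text, and p is the probability that an incoming lot conforms.

```python
import random


def simulate_skip_lot(p, i=10, f=0.5, x=4, n_lots=100_000, seed=0):
    """Return the fraction of lots inspected in one simulated run.

    p : probability that an incoming lot conforms
    i : consecutive accepted lots needed to leave qualification
    f : probability that a lot is sampled during the skip phase
    x : consecutive accepted lots needed to leave the retrial step
    """
    rng = random.Random(seed)
    phase = "QU"      # QU = qualification, S1 = skipping, RS1 = retrial
    run = 0           # current run of consecutive accepted lots
    inspected = 0
    for _ in range(n_lots):
        conforming = rng.random() < p
        if phase == "QU":
            inspected += 1                    # 100% inspection
            run = run + 1 if conforming else 0
            if run == i:
                phase, run = "S1", 0
        elif phase == "S1":
            if rng.random() < f:              # lot chosen at random
                inspected += 1
                if not conforming:
                    phase, run = "RS1", 0     # one rejection: go to retrial
            # otherwise the lot is skipped (accepted without inspection)
        else:                                 # retrial step, 100% inspection
            inspected += 1
            if conforming:
                run += 1
                if run == x:
                    phase, run = "S1", 0      # back to skipping
            else:
                phase, run = "QU", 0          # requalify from scratch
    return inspected / n_lots
```

With perfect incoming quality the simulated fraction inspected settles near f, and with no conforming lots the process never leaves qualification, so every lot is inspected.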


15.  In  practice  one  needs  to  consider  several  factors 
to  determine  how  responsive  the  process  should  be.  Some  of 
the  factors  are: 


a.  The  cost  of  nonconforming  lots  passing  through  the 
process  without  inspection  and  onto  the  consumer; 

b.  The  cost  of  inspecting  a  lot;  and 

c.  The  likelihood  of  significant  sudden  change  in 
quality  occurring. 


16.  Assessing the weight or importance of each of the
above factors will assist the designer of a skip-lot plan in
deciding whether they require responsive action in the form
of immediate resumption of 100% inspection, an increase in
sampling frequency, or postponement of a decision by
performing an intermediate inspection step.


17.  There are obviously many possible ways of
designing responsiveness into a skip-lot plan. Anyone who is
unsure about the consequences an abrupt shift in lot quality
would have should consult a statistician or quality
engineer who is familiar with acceptance sampling methods.


QUALITY  ASSURANCE  CHARACTERISTICS 


18.  There  are  three  quality  assurance  characteristics 
that  are  normally  used  to  evaluate  the  effectiveness  of  lot 
sampling plans. Each of these statistics can be graphed
against the expected quality of the materiel entering the
sampling  procedure.  The  resulting  graphs  can  be  used  as 
guidelines  in  designing  an  effective  skip-lot  plan,  or  in 
evaluating  a  plan  already  in  operation. 

19.  The first of these statistics is the average
fraction inspected (AFI). This is the expected fraction
of lots which would be chosen for inspection for a given lot
quality. Figure 2 gives a graph of AFI vs lot quality for
the sampling plan in Figure 1 with i = 10, f = 0.5 and
x = 4. If there were no skip phase the AFI would equal one.
As can be seen in the graph, the skip-lot sampling plan
reduces the number of inspected lots for a high incoming lot
quality, but the AFI increases rapidly once the lot quality
deteriorates. A procedure for calculating the AFI is
covered later in paragraph 27.

20.  The other two statistics are dependent on the AFI
and the incoming lot quality. The average output quality
(AOQ) is calculated here under the assumption that a lot
which is inspected and found to be non-conforming is
replaced by a lot which conforms to the desired quality. As
such, a number of non-conforming lots are inspected and
removed, thereby improving the outgoing lot quality. The
graph in Figure 3 shows the output lot quality. The AOQ is
expected to remain high for the following reason. When
incoming lot quality is low, the AFI is high. Thus, a large
number of non-conforming lots are inspected and replaced,
thereby raising the AOQ. This statistic is used mainly to
show that outgoing lot quality remains high under skip-lot
sampling even when incoming lot quality decreases. This
would be important in processes where a high outgoing
quality is essential. Given the incoming lot quality, p,
and the AFI, the AOQ is calculated by:

AOQ = p + AFI(1-p)                                        (1)

21.  The third statistic is the operating
characteristic (OC) curve. This curve gives the probability
of a random lot being accepted under this procedure. A lot
can be accepted in one of two ways. Either it can pass
through the system without being inspected, in which case
both conforming and non-conforming lots are accepted, or it
can be chosen for inspection and passed. In the latter
instance, any non-conforming lots are detected and removed.
Figure 4 gives the OC curve for the system in Figure 1.

This curve is useful in that it gives the expected
percentage of lots that will be accepted for a given
incoming lot quality. The separation between the diagonal
line and the skip-lot curve gives an indication of the
additional risk one takes of accepting lots as the lot
quality decreases. The OC is calculated by:

OC = 1 - AFI(1-p)                                         (2)
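
As a small worked illustration, the two relations AOQ = p + AFI(1-p) and OC = 1 - AFI(1-p) can be evaluated directly once the AFI is known. The function names below are chosen here for readability and are not from the report; p is the incoming lot quality (fraction conforming) and afi is the average fraction inspected.

```python
def average_outgoing_quality(p, afi):
    """AOQ: conforming lots plus non-conforming lots caught and replaced."""
    return p + afi * (1.0 - p)


def probability_of_acceptance(p, afi):
    """OC: a lot is rejected only if it is inspected and non-conforming."""
    return 1.0 - afi * (1.0 - p)


# Example: 80% incoming quality with three quarters of the lots inspected.
aoq = average_outgoing_quality(0.8, 0.75)
oc = probability_of_acceptance(0.8, 0.75)
```

With afi equal to one (no skip phase) the AOQ is one and the OC reduces to p, which is the diagonal line mentioned in paragraph 21.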


22.  The AFI curve can be used to study the effect
which each of the parameters has on the overall procedure.
The AFI curve is the best choice of the three in most cases
since the other two statistics are dependent on the AFI. By
comparing the AFI curves for different choices of
parameters, a number of different scenarios for skip-lot
sampling can be analyzed and the best choice used in the
design of a plan.

23.  In order to calculate the quality assurance
curves, one must first create the transition probability
matrix for a skip-lot sampling plan. Table 1 gives the
transition matrix for the sample plan in Figure 1. The
entries in the matrix represent the probability of
moving from one stage in the plan to the next. Note that
from the qualification phase (QU) one can only move to the
sampling step, hence there is only one entry in the QU row.
Similarly, from S1 one can only transfer to the retrial step
(RS1). There are two possibilities from the RS1 step. One
could return to S1 with probability p^x, or transfer back to
the QU phase with probability 1-p^x.


TABLE 1

TRANSITION MATRIX

          QU        S1        RS1

QU        -         1         -

S1        -         -         1

RS1       1-p^x     p^x       -

24.  In general, interpreting the transition matrix can
be accomplished by reading across the row for the step or
phase which one is exiting, and down the column of the step
which one is entering.

25.  From the transition matrix, a simplified Markov
chain approach can be used to calculate the steady state
probabilities. First, the number of times which the process
enters each stage, represented by T(S), can be calculated in
terms of the qualification stage as follows:

T(QU) = T(QU)                                             (3)

T(S1) = T(QU) + p^x T(RS1)

T(RS1) = T(S1)

solving in terms of T(QU):

T(S1) = T(QU) + p^x T(S1)

T(S1) = T(QU) / (1-p^x)                                   (4)

and

T(RS1) = T(S1) = T(QU) / (1-p^x)                          (5)
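
As a numerical check on the algebra above, the two balance relations T(S1) = T(QU) + p^x T(RS1) and T(RS1) = T(S1) can be iterated with T(QU) set to one and compared with the closed form T(QU)/(1-p^x). The iteration scheme and the function name are illustrative choices made here, not part of the report's method.

```python
def entries_per_cycle(p, x, sweeps=200):
    """Iterate the balance equations with T(QU) fixed at 1.

    Returns (T(QU), T(S1), T(RS1)). The iteration converges
    geometrically because each sweep multiplies the error by p**x.
    """
    t_qu = 1.0
    t_s1 = 0.0
    t_rs1 = 0.0
    for _ in range(sweeps):
        t_s1 = t_qu + (p ** x) * t_rs1   # entries into the sampling step
        t_rs1 = t_s1                     # every S1 visit ends in a retrial
    return t_qu, t_s1, t_rs1


# Compare with the closed form for p = 0.9, x = 4.
_, t_s1, t_rs1 = entries_per_cycle(0.9, 4)
closed_form = 1.0 / (1.0 - 0.9 ** 4)
```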

26.  A  working  table  can  be  created  once  these 
calculations  are  completed,  shown  as  Table  2.  Column  1  of 
the  table  gives  the  relative  number  of  times  the  process 
enters  each  stage.  These  terms  are  given  as  the 
coefficients  of  T(QU)  in  equations  (3)  -  (5)  above.  Column 
2  is  a  simplification  of  the  terms  in  column  1. 

27.  The terms in the first two columns only give the
number of times entering each phase. Column 3 gives the
expected number of lots in each phase, and column 4 is a
simplification of column 3. Hence, column 4 is the relative
number of lots expected each time the process enters the
phase. By multiplying column 2 by column 4, we get the
relative total number of lots in each phase. This is
recorded in column 5. The steady state probabilities in
terms of lots to be inspected during each phase are then
obtained by taking each entry in column 5 and dividing it
by the sum of the values from column 5 (denoted by D). The
AFI is calculated from column 5 of the working table by
multiplying each term in the column by the corresponding
sampling frequency for that step, then summing these terms
and dividing by D.
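
The working-table procedure can be sketched for the single-stage plan as follows. The coefficients of T(QU) come from equations (3) to (5). The expected number of lots per visit is an assumption made here by analogy with the working table in Annex A, since the entries of Table 2 are not reproduced in this text: (1-p^i)/(p^i(1-p)) lots for qualification, 1/(f(1-p)) lots for the sampling step, and x lots for the retrial step. The function name is hypothetical.

```python
def afi_single_stage(p, i=10, f=0.5, x=4):
    """AFI for a plan with one qualification, one sampling and one retrial step.

    Expected lots per visit are ASSUMED by analogy with Annex A;
    they are not taken from Table 2 of the report.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    # Relative number of entries into each step (coefficients of T(QU)).
    entries = {"QU": 1.0, "S1": 1.0 / (1.0 - p ** x), "RS1": 1.0 / (1.0 - p ** x)}
    # Assumed expected number of lots each time a step is entered.
    lots = {"QU": (1.0 - p ** i) / (p ** i * q), "S1": 1.0 / (f * q), "RS1": float(x)}
    # Fraction of lots inspected while in each step.
    freq = {"QU": 1.0, "S1": f, "RS1": 1.0}
    col5 = {step: entries[step] * lots[step] for step in entries}
    d = sum(col5.values())
    return sum(col5[step] * freq[step] for step in col5) / d
```

Under these assumptions the AFI approaches f as p approaches one and approaches one as quality deteriorates, which is the qualitative shape described for Figure 2.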
WORKING TABLE

28.  If specific values are given for i, f, and x (i.e., 10, .5, and 4
respectively), column (5) would look like:

SAMPLING FREQUENCY OPTIONS

29.  [...] sampling frequency options for
skip-lot sampling plans originated [as] a by-product of the
first author's involvement [as] project officer
for QWG/PIQA project QA/29. During the process of producing
an advisory paper [on the principles] of skip-lot sampling
it was noted that it [...] to choose the sampling
frequencies for plans in an [...] fashion. This lack of
formal thought in selecting [sampling] frequencies prompted
the initiation of this study.

30.  [To provide] quality assurance
personnel with some [concrete guidance] on selecting suitable
sampling frequencies, three [...] options have been
defined. They are [...] generating functions for
determining the sampling frequency [...] successive stage
in the skipping phase. [Table 4] lists the generating
functions for the three sampling options that are to be
compared.

31.  [These] generating functions were derived
from those [...] often applied for
skip-lot plans, and [...] of conservative
sampling initially and more economical (less frequent)
sampling at subsequent stages. Each option therefore starts
the skip phase with sampling [frequency 1/2], or 1 in
2 lots being inspected.


TABLE 4

SAMPLING FREQUENCY GENERATING RULES

Option     Generating Function

A          [...]   for n = 1, 2, 3, 4

B          [...]   for n = 1, 2, 3, 4

C          [...]   where [...]   for n = 1, 2, 3, 4

[Option A maintains] conservativeness throughout, reducing the sampling frequency
slowly,  whereas  option  B  is  the  most  liberal  option  in  terms 
of  the  fraction  of  lots  being  inspected.  Option  C  (also 
referred  to  as  the  Fibonacci  option) ,  meanwhile,  duplicates 
the  conservativeness  of  Option  A  for  the  early  sampling 
stages,  then  begins  to  relax  the  inspection  rate  with 
sampling  frequencies  set  between  the  two  extremes. 
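
The generating functions themselves are not legible in this copy of Table 4, so the sketch below uses three hypothetical rules chosen only because they match the verbal description: every option starts at 1 in 2, option A falls slowly, option B falls fastest, and the Fibonacci option C agrees with A at the first stages and then lies between the other two. The formulas 1/(n+1), 1/2^n and 1/F(n+2) are assumptions made here, not the report's definitions.

```python
from fractions import Fraction


def fibonacci(k):
    """k-th Fibonacci number with F(1) = F(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def option_a(n):
    # HYPOTHETICAL rule: 1/2, 1/3, 1/4, 1/5
    return Fraction(1, n + 1)


def option_b(n):
    # HYPOTHETICAL rule: 1/2, 1/4, 1/8, 1/16
    return Fraction(1, 2 ** n)


def option_c(n):
    # HYPOTHETICAL Fibonacci rule: 1/2, 1/3, 1/5, 1/8
    return Fraction(1, fibonacci(n + 2))


stages = range(1, 5)
table = {n: (option_a(n), option_b(n), option_c(n)) for n in stages}
```

Any other set of rules with the same ordering would serve equally well for exploring how the choice of frequencies affects the AFI.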

32.  These  features  of  Option  C  make  it  a  good 
candidate  as  a  compromise  solution  for  someone  who  is 
concerned  with  both  keeping  inspection  costs  down  and 
retaining  good  responsiveness  for  detecting  quality  changes. 

33.  Figures  5,  6,  and  7  illustrate  skip-lot  sampling 
plans  with  a  qualification  phase,  followed  by  a  skip  phase 
that  contains  four  sampling  stages.  Each  sampling  stage 
includes  an  intermediate  retrial  step  to  check  for  quality 
degradation.  Figure  5  depicts  the  plan  for  sampling 
frequencies  generated  from  option  A.  Figure  6  gives  the 
equivalent plan for frequencies obtained from option B, and
similarly, Figure 7 provides the plan derived from option C.

34.  The  remaining  portion  of  this  report  compares  the 
quality  assurance  properties  of  these  three  sampling  plans 
in  terms  of  their  respective  operating  characteristic  (OC) , 
average  outgoing  quality  (AOQ) ,  and  average  fraction  of  lots 
inspected  (AFI)  curves.  Initial  values  are  assigned  to  the 
clearance numbers (i, x1, x2, and x3). Later, a sensitivity
analysis  is  performed  on  the  clearance  numbers  to  measure 
their  impact  on  the  shape  of  the  aforementioned  curves. 

35.  The transition matrix and working table for this
four stage process are derived in Annex A. The graphs in
Figures 8, 9, and 10 are the AFI, AOQ, and OC curves for the
three sampling frequencies. In creating these graphs, the
values for i, x1, x2, and x3 are 10, 5, 10, and 15
respectively.

36.  The  AFI  curves  shown  in  Figure  8  demonstrate  the 
differences  between  the  three  options.  When  the  lot  quality 
is  very  high  (above  95%) ,  the  average  fraction  inspected  is 
very  low.  Also,  the  rate  at  which  the  AFI  rises  as  lot 
quality  decreases  is  very  slow.  However,  when  the  lot 
quality  drops  lower  (below  90%) ,  the  AFI  increases  rapidly. 
Therefore,  a  small  change  in  lot  quality  in  this  range  will 
result  in  a  large  change  in  the  average  fraction  of  lots 
inspected.  This  is  characteristic  of  all  three  curves.  At 
high  quality  levels,  the  AFI  for  sampling  option  A  is 
greater  than  for  option  C,  which  in  turn  is  larger  than 
option B. The lower AFI is more economical; however, it
[...] detection of a shift in quality [...].

[Figures 5, 6 and 7: skip-lot sampling plans, each beginning with a QUALIFICATION STAGE]

[Figure legend: Option A, Option B, Option C]

37.  [...] lots passed is
shown by the AOQ curve in Figure 9. [The] curve demonstrates
that the average output quality is lowest for option B and
largest for option A, with [that] for option C in between.
Another characteristic shown by the graph is that AOQ
remains high, even when lot quality [...]. The reason
for this was [...].

38.  [...] shows that for superior
lot quality there is a high probability of a lot being
accepted; however, this decreases rapidly when lot quality
decreases. Option A has the lowest probability of
acceptance of the three, [followed] by option C, then
option B. The options with the [...] AFI would be expected
to reject a smaller [...] number
of lots would be accepted, giving [...].

39.  [...] also be used to
study the [...] clearance variables,
denoted by x_i, in the skip-lot [...]. The graph given
in Figure 11 shows the AFI curve for [...] variations in x1, when
the sampling frequencies are generated from option C
(Fibonacci). As can be seen in the graph, when the quality
level is high, the AFI is [...] for the
different choices [...]. When the quality of the lots
decreases, the AFI shows a marked [...]. When the
value for x1 increases, the AFI [...] simultaneously.
Thus, a larger value for x1 [...] larger changes in
the value of the statistics. This [...]
chance of detection when negative [...].


[...] that the values
for AFI and the other statistics [...] are given for a steady
state process. If the situation [is one] of a limited number
of lots being produced, then a [...] for x_i would have
an adverse effect on the skip-lot [...]
taken into account [...] skip-lot plan [...].

[...] the other clearance
variables [...]
differences in the AFI are smaller than for [...] change [...].

[Figure 10: OPERATING CHARACTERISTIC (Option A, Option B, Option C)]

[Figure 11: CLEARANCE NUMBER SENSITIVITY (Varying X1)]


SUMMARY 


[...] This report has introduced the reader to the
statistical [...] known as skip-lot
sampling. The principles behind [...]


[The latter part] of the report dealt with three
different sampling frequency options that could be used for
multi-stage skip-lot sampling plans. These options were
compared by deriving their quality assurance
characteristics using the simplified Markov chain approach.
It was noted that the sampling frequency option C, generated
from the Fibonacci sequence of numbers, offered a good
compromise between [...] inspection effort and
maintaining a high degree of responsiveness to lot quality
deterioration.


[...] showed that the clearance [...] associated with [...]
demonstrated that one can
customize any skip-lot plan through stipulation of the
clearance numbers.

ANNEX A

DATED OCTOBER 1991

MATHEMATICAL COMPUTATIONS

1.  [...] the transition matrix [...] 5, 6 and [...] zero cell
[...] entries where it is possible to transit from the [...]
by the row to the corresponding column, representing the
step being entered.

TABLE A-1

        QU        S1        RS1       S2        RS2       S3        RS3       S4        RS4

QU      -         1         -         -         -         -         -         -         -

S1      -         -         1-p^{x1}  p^{x1}    -         -         -         -         -

RS1     1-p^4     p^4       -         -         -         -         -         -         -

S2      -         -         -         -         1-p^{x2}  p^{x2}    -         -         -

RS2     1-p^4     -         -         p^4       -         -         -         -         -

S3      -         -         -         -         -         -         1-p^{x3}  p^{x3}    -

RS3     1-p^4     -         -         -         -         p^4       -         -         -

S4      -         -         -         -         -         -         -         -         1

RS4     1-p^4     -         -         -         -         -         -         p^4       -


2.  The stationary probabilities in terms of sequences
or runs of each step can be determined by summing down the
respective columns of this matrix. These equations can be
restated in terms of T(QU), which gives the expressions
identified by the bracketed numbers for each step.

T(QU) = T(QU)                                                        (1)

T(S1) = T(QU) + p^4 T(RS1)

T(RS1) = (1-p^{x1}) T(S1)

T(S1) = T(QU) + p^4 (1-p^{x1}) T(S1)

T(S1) = T(QU) / [1-p^4(1-p^{x1})]                                    (2)

T(RS1) = (1-p^{x1}) T(QU) / [1-p^4(1-p^{x1})]                        (3)

T(S2) = p^{x1} T(S1) + p^4 T(RS2)

T(RS2) = (1-p^{x2}) T(S2)

T(S2) = p^{x1} T(S1) + p^4 (1-p^{x2}) T(S2)

T(S2) = p^{x1} T(S1) / [1-p^4(1-p^{x2})]

T(S2) = p^{x1} T(QU) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})]}         (4)

T(RS2) = p^{x1} (1-p^{x2}) T(QU) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})]}    (5)

T(S3) = p^{x2} T(S2) + p^4 T(RS3)

T(RS3) = (1-p^{x3}) T(S3)

T(S3) = p^{x2} T(S2) + p^4 (1-p^{x3}) T(S3)

T(S3) = p^{x2} T(S2) / [1-p^4(1-p^{x3})]

T(S3) = p^{x1} p^{x2} T(QU) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})]}    (6)

T(RS3) = p^{x1} p^{x2} (1-p^{x3}) T(QU) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})]}    (7)

T(S4) = p^{x3} T(S3) + p^4 T(RS4)

T(RS4) = T(S4)

T(S4) = p^{x3} T(S3) / (1-p^4)

T(RS4) = T(S4) = p^{x1} p^{x2} p^{x3} T(QU) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)}    (8) (9)


3.  In order to obtain the steady state probabilities
in terms of lots one must extract the coefficients from
equations 1-9 and multiply them by the expected length of
each step in terms of lots. A working table can be
constructed (Table A-2) to assist in keeping track of the
algebraic operations that need to be carried out. Reference
[2] provides the justification for using this working table
method.

TABLE A-2

WORKING TABLE

STEP    COLUMN 1: COEFFICIENT OF T(QU)

QU      1

S1      1 / [1-p^4(1-p^{x1})]

RS1     (1-p^{x1}) / [1-p^4(1-p^{x1})]

S2      p^{x1} / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})]}

RS2     p^{x1} (1-p^{x2}) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})]}

S3      p^{x1} p^{x2} / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})]}

RS3     p^{x1} p^{x2} (1-p^{x3}) / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})]}

S4      p^{x1} p^{x2} p^{x3} / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)}

RS4     p^{x1} p^{x2} p^{x3} / {[1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)}

WORKING TABLE (CONTINUED)

STEP    COLUMN 2: SIMPLIFICATION OF COLUMN 1
        COLUMN 3: EXPECTED NO. OF LOTS

QU      Column 2:  [1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)
        Column 3:  (1-p^i) / [p^i (1-p)]

S1      Column 2:  [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)
        Column 3:  (1-p^{x1}) / [f1 (1-p)]

RS1     Column 2:  (1-p^{x1}) [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)
        Column 3:  4

S2      Column 2:  p^{x1} [1-p^4(1-p^{x3})] (1-p^4)
        Column 3:  (1-p^{x2}) / [f2 (1-p)]

RS2     Column 2:  p^{x1} (1-p^{x2}) [1-p^4(1-p^{x3})] (1-p^4)
        Column 3:  4

S3      Column 2:  p^{x1} p^{x2} (1-p^4)
        Column 3:  (1-p^{x3}) / [f3 (1-p)]

RS3     Column 2:  p^{x1} p^{x2} (1-p^{x3}) (1-p^4)
        Column 3:  4

S4      Column 2:  p^{x1} p^{x2} p^{x3}
        Column 3:  1 / [f4 (1-p)]

RS4     Column 2:  p^{x1} p^{x2} p^{x3}
        Column 3:  4

WORKING TABLE (CONTINUED)

STEP    COLUMN 4: SIMPLIFICATION OF COLUMN 3
        COLUMN 5: COLUMN 2 x COLUMN 4

QU      Column 4:  f1 f2 f3 f4 (1-p^i)
        Column 5:  f1 f2 f3 f4 (1-p^i) [1-p^4(1-p^{x1})] [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)

S1      Column 4:  p^i f2 f3 f4 (1-p^{x1})
        Column 5:  p^i f2 f3 f4 (1-p^{x1}) [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)

RS1     Column 4:  4 p^i (1-p) f1 f2 f3 f4
        Column 5:  4 p^i (1-p) f1 f2 f3 f4 (1-p^{x1}) [1-p^4(1-p^{x2})] [1-p^4(1-p^{x3})] (1-p^4)

S2      Column 4:  p^i f1 f3 f4 (1-p^{x2})
        Column 5:  p^i f1 f3 f4 (1-p^{x2}) p^{x1} [1-p^4(1-p^{x3})] (1-p^4)

RS2     Column 4:  4 p^i (1-p) f1 f2 f3 f4
        Column 5:  4 p^i (1-p) f1 f2 f3 f4 p^{x1} (1-p^{x2}) [1-p^4(1-p^{x3})] (1-p^4)

S3      Column 4:  p^i f1 f2 f4 (1-p^{x3})
        Column 5:  p^i f1 f2 f4 (1-p^{x3}) p^{x1} p^{x2} (1-p^4)

RS3     Column 4:  4 p^i (1-p) f1 f2 f3 f4
        Column 5:  4 p^i (1-p) f1 f2 f3 f4 p^{x1} p^{x2} (1-p^{x3}) (1-p^4)

S4      Column 4:  p^i f1 f2 f3
        Column 5:  p^i f1 f2 f3 p^{x1} p^{x2} p^{x3}

RS4     Column 4:  4 p^i (1-p) f1 f2 f3 f4
        Column 5:  4 p^i (1-p) f1 f2 f3 f4 p^{x1} p^{x2} p^{x3}

4.  Column 1 of Table A-2 [gives] the coefficients of
T(QU) from equations 1-9. Column 2 is a simplification of
column 1 by getting rid of the [...]
multiplication or division [...] of all the values in
the column does not affect the [...]. Column 3 gives
the expected length for [...] of lots. Column
4 is a simplification of column 3 [...]. Column 5 is
[...] of the values from columns [...].

5.  [...] probabilities [...] lots are
obtained by taking each of the terms in column 5 and
dividing them by D, the sum of all the terms in column 5.

6.  [...] each [of the] sampling
frequency options from [Table 4], one proceeds in the same
manner as was explained [...]
for the simple skip-lot case [...]. The terms in column 5
are first multiplied by the [...] frequency corresponding
to the respective skip-lot plan [...] to
multiplying the column 5 entries for the qualification stage
and the retrial stages by one (100% inspection) and the
sampling stages by their respective sampling frequencies
(f_i, i = 1,2,3,4). The [...] together
and divided by D to obtain the AFI.
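
The Annex A calculation for the four-stage plan can be sketched as below. The entry counts follow equations 1-9 and the expected step lengths follow column 3 of the working table as reconstructed from the derivation; the function name, the argument layout, and the example frequencies in the usage line are illustrative assumptions. The default clearance numbers i = 10 and x = (5, 10, 15) are the values quoted in paragraph 35 of the main text.

```python
def afi_four_stage(p, f, i=10, x=(5, 10, 15), retrial=4):
    """AFI for a qualification stage followed by four sampling stages.

    p       : incoming lot quality (fraction conforming), 0 < p < 1
    f       : four sampling frequencies (f1, f2, f3, f4)
    i       : clearance number of the qualification stage
    x       : clearance numbers x1, x2, x3 of sampling stages 1 to 3
    retrial : number of lots examined in each retrial step
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    pr = p ** retrial
    px = [p ** xk for xk in x]
    # Relative number of entries into each step, with T(QU) = 1.
    t_s, t_rs = [], []
    carry = 1.0
    for k in range(3):
        t = carry / (1.0 - pr * (1.0 - px[k]))
        t_s.append(t)
        t_rs.append((1.0 - px[k]) * t)
        carry = px[k] * t
    t4 = carry / (1.0 - pr)
    t_s.append(t4)
    t_rs.append(t4)
    # Expected number of lots each time a step is entered (column 3).
    len_qu = (1.0 - p ** i) / (p ** i * q)
    len_s = [(1.0 - px[k]) / (f[k] * q) for k in range(3)] + [1.0 / (f[3] * q)]
    # D is the relative total number of lots; the numerator weights each
    # step by the fraction of its lots that are inspected.
    d = len_qu + sum(t_s[k] * len_s[k] + t_rs[k] * retrial for k in range(4))
    inspected = len_qu + sum(
        f[k] * t_s[k] * len_s[k] + t_rs[k] * retrial for k in range(4)
    )
    return inspected / d


# Usage with illustrative frequencies 1/2, 1/3, 1/5, 1/8 (an assumption).
example = afi_four_stage(0.95, (0.5, 1.0 / 3.0, 0.2, 0.125))
```

For quality close to one the process spends nearly all of its time in the last sampling stage, so the AFI tends toward f4.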
Toxic Fumes Reduction Modifications
to  the  M2A2  Bradley  Fighting  Vehicle 

Linda  L.  C.  Moss  and  William  P.  Johnson 
U.S.  Army  Ballistic  Research  Laboratory 
Aberdeen  Proving  Ground,  MD  21005-5066 


ABSTRACT 

Carbon monoxide concentrations measured in the M2A2 Bradley Fighting Vehicle were high enough to
cause the crew's predicted carboxyhemoglobin level to exceed the Military Standard during firings of the
test scenario. Five vehicle modifications were proposed to correct this deficiency, and
experiments were conducted to identify carboxyhemoglobin levels of the modified vehicles relative to the
baseline vehicle. The sample data were generalized to determine the probability that each configuration
[would pass the] toxic fumes acceptance test. Recommendations for permanent vehicle modifications were
made to the Program Manager-Bradley Fighting Vehicles based on the study results.


1.  INTRODUCTION 

The last Quality Performance Test for new or modified Bradley Fighting Vehicles is the Toxic
Fumes Acceptance Test. This test assures that the crew's exposure to toxic fumes from the propellant
gases of the weapon system is within safety limits. The safety limits for carbon monoxide stated in the
Military Standard 1472C, paragraph 5.13.7.4.2 are:

Carbon  monoxide  in  personnel  areas  shall  be  reduced  to  the  lowest  levels 
feasible.  Personnel  shall  not  be  exposed  to  concentrations  of  carbon  monoxide 
(CO)  in  excess  of  values  which  result  in  carboxyhemoglobin  levels  in  their  blood 
greater  than  the  following  percentages:  5%  COHb,  all  system  design  objectives 
and  aviation  systems  performance  limits;  10%  COHb,  all  other  system 
performance  limits . 

[...] exceeded the 10% limit. Consequently, the Program Manager-Bradley Fighting Vehicles formed a working
group to develop and test modifications for reducing the toxic fumes concentrations within the Bradley
[...] Command (TECOM), [...] Activity (AMSAA), the Training and Doctrine
Command (TRADOC) and the FMC Corporation.

The working group concluded that several proposed vehicle modifications offered sufficient
potential for reducing the toxic fumes in the Bradley to warrant testing. Descriptions of the recommended
modifications are presented in subsequent sections of this report.

The  test  program  objectives  were: 


(1)  To quantify the level of toxic fumes produced in the unmodified,
600-horsepower M2A2 Bradley Fighting Vehicle during firings of the
acceptance test.

(2)  To quantify the relative effectiveness of proposed Bradley Fighting
Vehicle modifications for reducing toxic fumes concentrations within the
Bradley Fighting Vehicle during firings of the acceptance test.

(3)  To provide a recommendation to the Program Manager-Bradley Fighting
Vehicles for a solution to the Bradley toxic fumes problem based on test
results and data analysis.

The acceptance test scenario is a combination of TRADOC events 12 and 15. The scenario listed
in Table 1 shows the sequence for firing the M242 25-millimeter (mm) gun and the M240C 7.62-mm
coaxial machine gun. The method of firing is either a single-shot mode or a burst mode. At the conclusion
of the 46-minute scenario, 320 25-mm rounds and 300 7.62-mm rounds are expended.

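Table 1.  Bradley Fighting Vehicle Acceptance Test (TRADOC Events 12 & 15)

Columns: TIME (min); NUMBER OF ROUNDS, 25-mm; NUMBER OF ROUNDS, 7.62-mm; MODE
(B = Burst / SS = Single Shot).  Entries are listed column by column, in order of appearance.

Minutes 0 through 28

TIME (min):  0, 2, 4, 6, 8, 10, 11, 12, 13, 14, 15, 17, 19, 20, 21, 21.5, 22, 22.5, 23, 23.5, 24, 25, 26, 27, 28

25-mm:       10, 10, 10, 10, 10, 10, 10, 10, 5, 5, 5, 5, 5, 5, 5, 10, 10, 10

7.62-mm:     21, 21, 21, 21, 42, 21, 43

MODE:        SS, SS, SS, SS, SS, SS, SS, SS, B, B, B, B, B, B, B, SS, SS, SS, B, SS, [illegible], B, SS, SS, SS

Minutes 29 through 46

TIME (min):  29, 30, 31, 32, 33, 34, 35, 36, 36.5, 37, 38, 39, 40, 40.5, 41, 41.5, 42, 42.5, 43, 43.5, 44, 44.5, 45, 46

25-mm:       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 5, 5, 5, 5, 5, 5, 5, 5, 5, 10

7.62-mm:     55, 55

MODE:        SS, SS, SS, SS, SS, SS, SS, SS, SS, SS, SS, SS, SS, B, SS, B, SS, B, SS, B, SS, B, B, B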


2.  PRETEST  DATA  ANALYSIS 

[...] of the BFV system [...] values less than 10%. Consequently, our analysis focused on the highest COHb value
[...] of crew position. This value will be subsequently referred to
as the vehicle COHb level. [If the vehicle COHb level] satisfied the military standard, all
other COHb levels within the vehicle satisfied the standard. Our approach was conservative, taken to
guarantee that the baseline and/or modified vehicles provided protection to every crew position.
Therefore,  all  analyses  to  determine  the  best  vehicle  configuration  used  the  vehicle  COHb  level. 

Percentage  data  are  frequently  transformed  using  the  arc  sine  transformation  to  stabilize  the 
variances.  However,  the  arc  sine  transformation  neither  increased  stability  of  the  variances  nor  changed 
the  results  in  terms  of  significant  differences.  Consequently,  all  calculations  within  this  report  were 
performed  with  the  vehicle  COHb  values  in  percentage  units. 

The  mean  vehicle  COHb  value  calculated  from  the  historic  data  is  8.9%  with  a  standard  deviation 
of  2.37.  This  information  was  used  to  determine  the  maximum  mean  COHb  level  that  could  be  tolerated 
with  a  given  sample  size  while  retaining  a  predetermined  level  of  confidence  that  the  population  mean 
COHb  level  would  be  less  than  10%. 

If we assume the sample standard deviation in our tests is no more than 2.37, then to verify the
hypothesis that the true mean of the BFV COHb level was less than 10% with the desired 99.5% level of
confidence, the sample mean calculated from a minimum of six experimental replicates cannot exceed
6.1%. This was a reasonable goal for at least one of the proposed modifications, the BIGGRS modification
(described in section 5), since its projected performance in the AO vehicle was 4.0%.
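
One calculation that reproduces the 6.1% figure is a one-sided Student-t bound on the mean. Whether the authors computed it exactly this way is an assumption made here; the critical value 4.032 is the tabulated one-sided 99.5% t quantile for 5 degrees of freedom, and the function name is illustrative.

```python
import math

# One-sided 99.5% Student-t quantile for n - 1 = 5 degrees of freedom
# (taken from standard t tables).
T_CRIT_995_DF5 = 4.032


def max_sample_mean(limit, s, n, t_crit):
    """Largest sample mean for which the upper confidence bound on the
    population mean, mean + t_crit * s / sqrt(n), stays below `limit`."""
    return limit - t_crit * s / math.sqrt(n)


# Six replicates, standard deviation 2.37, limit 10% COHb.
threshold = max_sample_mean(10.0, 2.37, 6, T_CRIT_995_DF5)
```

With these inputs the threshold comes out at about 6.1% COHb.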

The  discussion  to  this  point  has  addressed  the  criterion  established  to  provide  a  99.5%  level  of 
confidence  that  the  "mean"  vehicle  COHb  values  would  not  exceed  10%.  To  establish  tolerance  limits  with 
this  high  level  of  confidence  that  a  large  percentage  of  the  population  of  COHb  values  would  not  exceed 
10%,  a  much  larger  sample  size  would  have  been  required.  For  example,  if  we  had  wished  to  state  with 
99.5%  confidence  that  at  least  75%  of  the  population  of  vehicle  COHb  values  would  be  less  than  10%,  a 
minimum  of  30  experimental  replications  of  each  vehicle  configuration  would  have  been  necessary  when 
assuming  a  normal  distribution.  Resource  constraints  prohibited  the  testing  of  larger  sample  sizes  within 
this  test  program. 


3.  DATA  COLLECTION 

Although  the  primary  objective  of  this  test  program  was  the  documentation  of  carbon  monoxide 
at  the  crew  positions,  other  toxic  fumes  (such  as  carbon  dioxide,  ammonia  and  oxides  of  nitrogen)  were 
monitored to ensure that their levels were below the applicable standards. These additional data will be
stored  in  a  CSTA  database  for  future  reference  and  analysis. 

Toxic  fumes  data  were  collected  by  the  Chemistry  Branch  of  CSTA  by  placing  sampling  tubes  in 
the  breathing  zones  of  the  driver,  gunner,  commander,  and  in  the  center  of  the  crew  compartment.  The  air 
was  continuously  analyzed  for  ammonia  and  carbon  monoxide  at  all  four  locations  and  for  carbon  dioxide 
at  the  commander  and  crew  positions.  All  measurements  were  made  with  rapid  response,  non-dispersive, 
infrared  gas  analyzers.  Oxides  of  nitrogen  (nitric  oxide  and  nitrogen  dioxide)  were  continuously  analyzed, 
at  the  commander  and  crew  positions,  by  chemiluminescent  analyzers.  Concentration  data  were  recorded 
at  a  minimum  of  four  times  per  second. 

Differential  pressures  (inside  -  outside)  were  measured  in  the  turret  by  a  capacitance  type 
differential  pressure  sensor  (-0.1  to  +0.1  psid)  and  recorded  at  a  minimum  of  four  times  per  second.  The 
interior  temperature  was  measured  in  the  turret  by  T-type  thermocouples. 

All  testing  was  completed  with  25-mm  M793  TP-T  and  7.62-mm  Ball/Tracer  ammunition. 
Firings  were  not  conducted  if  the  wind  speed  exceeded  10  mph  or  if  the  relative  humidity  exceeded  90%. 
All  firings  were  conducted  under  the  same  general  meteorological  conditions  and  as  close  together  as 
possible. 
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4.  ANALYTICAL  PROCEDURES 


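Carboxyhemoglobin (COHb) levels were calculated from the measured carbon monoxide by the modified Coburn-Forster-Kane equation:

$$\%COHb_t = \%COHb_0 \, e^{-t/A} + 218 \left(1 - e^{-t/A}\right) \left[\frac{1}{B} + \frac{ppm\ CO}{1403}\right]$$

where:

%COHb_0 and %COHb_t are the COHb levels at the beginning and end of the computation interval, t is the length of the computation interval, ppm CO is the average carbon monoxide concentration over the interval, and

A & B are regression constants determined by the work level.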

Carbon monoxide concentrations [...] throughout the test scenario, until the concentration returned to pre-fire levels. The COHb increment was calculated using the average carbon monoxide concentration, a computation interval of 15 seconds and a work level characteristic of the physical exertion required in weapon firing. An initial COHb concentration [...] per MIL-STD-1472C. The computed COHb concentration from one iteration became the beginning COHb concentration for the successive iteration. Final COHb concentrations [...] scenario for each vehicle configuration. Statistical comparisons of the mean COHb levels were made both parametrically, under the assumption that the COHb levels were normally distributed, and nonparametrically. The results of these comparisons, presented in subsequent sections of this report, served as the basis for the recommendations included herein.
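
The iteration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the regression constants A and B, the initial COHb level, and the carbon monoxide trace are all hypothetical placeholders, and the interval is assumed to be expressed in minutes. None of these values comes from the test program.

```python
import math

def cohb_step(cohb_start, co_ppm, minutes, a, b):
    """Apply the equation once: COHb (%) at the end of one computation interval."""
    decay = math.exp(-minutes / a)
    return cohb_start * decay + 218.0 * (1.0 - decay) * (1.0 / b + co_ppm / 1403.0)

def cohb_series(cohb_initial, co_ppm_by_interval, minutes, a, b):
    """Chain the equation: the result of each interval seeds the next one."""
    levels = [cohb_initial]
    for ppm in co_ppm_by_interval:
        levels.append(cohb_step(levels[-1], ppm, minutes, a, b))
    return levels

# Hypothetical inputs, for illustration only.
A, B = 175.0, 1958.0             # placeholder regression constants
INTERVAL_MIN = 15.0 / 60.0       # 15-second computation interval (minutes assumed)
co_trace = [0.0, 400.0, 900.0, 600.0, 200.0, 50.0, 0.0]  # average ppm per interval

levels = cohb_series(1.0, co_trace, INTERVAL_MIN, A, B)
print([round(v, 3) for v in levels])
```

Because the update is a first-order exponential, chaining several short intervals with a constant concentration gives the same result as one long interval, which is a convenient sanity check on any implementation.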


5. PHASE I TESTING


5.a. VEHICLE CONFIGURATIONS

[...] vehicle configurations within TRADOC Events 12 & 15. These [...] with the Bradley Improved Gun Gas Removal System (BIGGRS) [...]

[...] considered a "worst" case for toxic fumes because of the internal under-pressure condition [...] during concurrent operation of the engine, the Noah-Howden cooling fan and the vanaxial rotor fan.

[...] modified the operational sequence of the rear hull fan. The activation time of the rear hull fan [...] seconds [...] (Figure 1). This modification [...] from 37% of the scenario time to 90%. The operational sequence of the driver's hull fan was not changed: it activated 15 seconds after trigger release, remained on high speed for a maximum of one minute [...] next scenario event. Both fans directed the air flow from the exterior of the vehicle to the interior.

Figure 1. Fan Control Box Operational Sequence

The third vehicle configuration tested in Phase I was the BIGGRS kit, which consisted of three
parts:  (a)  a  removable  cover  (or  sock)  for  the  coaxial  machine  gun  feed  chute,  (b)  improved  seals  for  the 
coaxial  machine  gun  access  doors  and  (c)  a  deflector  to  direct  the  gases  from  the  25-mm  chain  gun  breech 
to  the  vanaxial  rotor  fan  located  between  the  main  gun  and  the  coaxial  machine  gun.  However,  an 
independent  decision  by  the  PM-BFV  to  include  improved  door  seals  on  all  M2A2  future  production 
vehicles  and  on  the  test  vehicle  eliminated  the  need  for  the  seals  as  part  of  the  BIGGRS  kit.  The  improved 
seals  for  the  coaxial  machine  gun  access  doors  used  with  the  baseline  and  all  vehicle  modifications  were 
developed  and  installed  by  the  FMC  Corporation. 


An  experimental  test  design  was  constructed  to  test  the  hypothesis  that  the  mean  COHb  values  of 
the  baseline  vehicle  (BL),  the  fan  modification  (FM),  and  the  Bradley  Improved  Gun  Gas  Removal  System 
(BG)  were  equal.  The  alternative  hypothesis  was  that  at  least  one  mean  value  differed  from  the  others.  Six 
replicates  of  each  configuration  were  planned  for  a  total  of  18  tests.  A  "pseudorandom"  design  was  planned 
in  two  blocks.  Three  replicates  from  each  configuration  made  up  a  block.  The  design  is  considered 
pseudorandom  because  the  first  block  would  be  completed  before  the  start  of  the  second  block.  This 
approach provided a checkpoint on the efficiency of the potential vehicle modifications. At the halfway
point, a cursory analysis was performed to determine if the tested modifications reduced the vehicle COHb
levels as desired.
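
A run order with this structure can be generated as follows. This is a minimal sketch, assuming the order within each block is fully randomized; the function name and the seed are hypothetical, and the actual run order used in the program is not given in the text.

```python
import random

def blocked_run_order(configs, reps_per_block, n_blocks, seed=None):
    """Randomize the run order within each block; blocks are run one after another."""
    rng = random.Random(seed)
    order = []
    for block in range(1, n_blocks + 1):
        runs = [c for c in configs for _ in range(reps_per_block)]
        rng.shuffle(runs)
        order.extend((block, c) for c in runs)
    return order

# Three configurations, three replicates per block, two blocks: 18 tests.
plan = blocked_run_order(["BL", "FM", "BG"], reps_per_block=3, n_blocks=2, seed=37)
for test_no, (block, config) in enumerate(plan, start=1):
    print(test_no, block, config)
```

Completing block 1 before block 2 is what allows the halfway checkpoint, at the cost of restricting the randomization to within blocks.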


5.b.  PARAMETRIC  ANALYSIS 

Analysis of variance is robust to departures from normality; however, it is not robust to
heterogeneity of variance. Therefore, Cochran's test was implemented on the variances of the three
configurations.  The  results  indicated  no  significant  difference  among  the  variances;  hence,  the  assumption 
of  homogeneity  of  variance  appeared  justified. 

At the α = 0.10 level of significance chosen by the working group, the critical F-value, F0.10(2,6),
equals 3.46. The ANOVA yielded an F-statistic of 1.00. Since the F-statistic was less than the
critical value, we failed to reject the null hypothesis of equal means. This implied that, within the
constraints of the statistical test, neither the Fan Mod nor BIGGRS provided significantly lower COHb
levels than the baseline configuration.

A  plot  of  the  percent  COHb  obtained  from  each  crew  position  in  each  vehicle  configuration 
tested  is  shown  in  Figure  2.  Here  we  observed  that  each  configuration  had  at  least  one  sample  point  close 
to  or  greater  than  the  limit  specification  of  10%.  Figure  3  shows  the  mean  vehicle  COHb  value  and 
confidence  interval  for  each  configuration.  The  overlapping  intervals  indicate  that  the  mean  COHb  levels 
are  not  significantly  different.  The  estimate  of  the  standard  deviation,  Sp,  used  to  construct  the  confidence 
intervals  is  the  square  root  of  the  mean  squared  error  from  the  ANOVA,  which  is  1.40. 

Figure 2. Percent COHb by Position


5.c.  NONPARAMETRIC  ANALYSIS 

In  addition  to  the  parametric  analysis,  the  Kruskal-Wallis  Nonparametric  Test  was  performed  on 
the  ranks  of  the  data.  Since  three  tests  were  performed  on  each  configuration,  a  total  of  nine  data  points 
(nine COHb values) were ranked, with the lowest receiving a rank of one. The test statistic, T, computed
as described by Conover (reference 4), is 3.29. This value is less than the exact 1 - α (0.90) quantile, which equals 4.6.
Therefore, as with the parametric analysis, we would fail to reject the hypothesis that the configurations
have equal mean COHb values.
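
The ranking and the test statistic can be sketched as below. The nine COHb values are made up for illustration (the individual test results are not listed in the text), so the statistic printed here is not the 3.29 reported above; only the comparison value 4.6 is taken from the text. The sketch assumes no tied observations.

```python
def kruskal_wallis(groups):
    """Kruskal-Wallis statistic for groups with no tied observations."""
    pooled = sorted(v for g in groups for v in g)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch does not handle ties")
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # lowest value gets rank 1
    n_total = len(pooled)
    weighted = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

# Hypothetical COHb values (%), three tests per configuration.
sample = [[9.8, 8.7, 10.4], [7.1, 9.2, 6.5], [8.1, 9.5, 7.4]]
t_stat = kruskal_wallis(sample)
print(round(t_stat, 2), "reject" if t_stat > 4.6 else "fail to reject")
```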


6.  ADDITIONAL  MODIFICATIONS  CONSIDERED 

The  conclusions  in  Phase  I  forced  consideration  of  alternative  vehicle  modifications.  No  growth 
potential  appeared  to  exist  for  the  BIGGRS  modification;  therefore,  it  was  eliminated  from  further 
consideration.  A  comparison  of  the  baseline  and  Fan  Mod  data  collected  during  Phase  I  indicated  that  the 
mean  COHb  level,  at  each  position  and  for  the  vehicle,  was  reduced  from  the  baseline  levels  during  the  Fan 
Mod configuration tests. The baseline test conditions required normal operation of both the driver's and
the  rear  hull  fans;  the  Fan  Mod  configuration  required  normal  operation  of  the  driver’s  fan  and  increased 
operating  time  of  the  rear  hull  fan.  The  increased  operating  time  of  the  rear  hull  fan  increased  the  airflow 
within  the  test  vehicle  from  that  of  the  baseline  configuration.  These  observations  led  to  the  hypothesis 
that  the  vehicle  COHb  level  could  be  reduced  further  by  increasing  the  airflow  within  the  test  vehicle. 

Two options were recognized for increasing the vehicle internal airflow. The first, hereafter
referred  to  as  the  Dual  Fan  (DF)  modification,  increased  the  airflow  by  changing  the  operating  pattern  of 
both  the  rear  and  the  driver’s  hull  fans  as  previously  illustrated  in  Figure  1.  Both  fans  were  activated 
simultaneously  at  trigger  pull  and  remained  operating  for  a  minimum  of  one  minute  fifteen  seconds  after 
trigger  release. 

The  second  option,  hereafter  referred  to  as  the  Reversed  Fan  (RF),  activated  both  the  rear  and 
driver’s  hull  fans  at  trigger  pull  and  physically  reversed  the  driver’s  hull  fan.  The  reversal  of  the  fan 
directed  the  airflow  from  the  interior  of  the  vehicle  to  the  exterior. 

Three  replications  each  of  the  Reversed  Fan  and  the  Dual  Fan  modifications,  within  TRADOC 
Events  12  &  15,  were  recommended  for  Phase  II  testing.  Subsequent  to  the  completion  of  Phase  II,  it  was 
proposed  that  the  vehicle  modification  that  appeared  to  offer  the  greatest  potential  for  reducing  the  vehicle 
COHb  level  below  the  desired  6.1%  limit  be  tested  at  least  three  additional  times  within  Phase  III. 


7.  PHASE  II  TESTING  AND  ANALYSIS 
7.a.  PARAMETRIC  ANALYSIS 

Three  replicates  each  of  the  Dual  Fan  and  Reversed  Fan  modifications  were  tested  during  Phase 
II within TRADOC Events 12 & 15. An ANOVA was performed on both Phase I and Phase II data. Since
the F-statistic, 3.35, was greater than the critical F-value, F0.10(4,10) = 2.61, the hypothesis of equal means
was rejected. To determine which means were different, the 90% confidence interval for each mean was
plotted with Sp, the pooled estimate of the standard deviation, equal to 1.23. Failure of the confidence intervals to
overlap indicates that the means of the associated modifications are significantly different from each other.
The  horizontal  dotted  line  in  the  figure  provides  a  visual  reference  for  the  lower  confidence  bound  for  the 
mean  of  the  baseline  configuration.  If  a  confidence  interval  falls  below  the  dotted  line,  the  mean  of  the 
associated  modification  is  significantly  different  from  the  baseline  configuration. 

The confidence intervals in Figure 4 show no significant difference among the tested "modified"
vehicles;  however,  the  confidence  intervals  for  both  the  Dual  Fan  and  the  Reversed  Fan  mods  fall  below 
the  lower  confidence  bound  of  the  baseline.  This  observation  led  to  the  conclusion  that  both  the  Dual  Fan 
and  the  Reversed  Fan  modifications  were  significantly  better  than  the  baseline  in  reducing  toxic  fumes 
within  the  test  vehicle.  However,  because  the  observed  vehicle  COHb  mean  of  the  Reversed  Fan  data  was 
(1)  less  than  the  observed  vehicle  COHb  mean  of  the  Dual  Fan  data  and  (2)  less  than  the  predetermined 


7.b. NONPARAMETRIC ANALYSIS

The mean COHb values and standard deviation for each configuration in the three phases are
listed in Table 2. Since the hypothesis of homogeneity of variance could not be rejected, ANOVA was performed
on  all  the  data  with  the  degrees  of  freedom  adjusted  to  account  for  the  second  set  of  three  baseline  and 
three  reverse  fan  tests  conducted  several  days  after  the  initial  tests  in  Phase  I. 

Table 2. Phase I - III Statistics for Each Configuration

Configuration        n    Mean    Std. Dev.
Baseline             6    9.5     1.02
Rear Fan             3    7.9     1.82
BIGGRS               3    8.1     1.37
Dual Fan             3    6.6     1.27
Reverse Fan          6    5.5     0.97
Driver's Fan Off     4    7.6     0.51

The 90% confidence intervals about the means are presented in Figure 5 for each vehicle configuration tested within the program. The pooled estimate of the standard deviation used to construct the confidence intervals was Sp = 1.13. Figure 5 reveals that at the 90% confidence level the Dual Fan, Reverse Fan and Fan Off modifications are significantly different from the baseline vehicle. A similar plot drawn for the 99% confidence intervals about the means is shown in Figure 6. Here only the Reverse Fan modification is significantly different from the baseline vehicle. However, sample size was a contributor to the relative lengths of the intervals and the conclusions drawn. This raises the question, "What conclusions would have been drawn if the sample sizes of the Dual Fan and Fan Off modifications were equal to that of the Reverse Fan (six)?" We conclude from the dotted lines about the means shown in Figure 6 that if the sample size were increased to six for both the Dual Fan and Fan Off modifications (assuming the mean were no greater than the mean calculated in the sample of size three and that the variance remained the same), the Dual Fan would have been determined to be significantly different from the baseline vehicle; however, the Fan Off modification would not have been significantly different from the baseline vehicle.
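
As a check on the pooled estimate, the short Python sketch below pools the standard deviations listed in Table 2. It uses plain pooling over the six configurations and ignores the degrees-of-freedom adjustment mentioned in the text, which is a simplifying assumption; the dictionary and function names are illustrative. It also shows how much shorter an interval becomes when n grows from three to six with the same Sp.

```python
import math

# configuration: (n, standard deviation), as listed in Table 2
table2 = {
    "Baseline": (6, 1.02), "Rear Fan": (3, 1.82), "BIGGRS": (3, 1.37),
    "Dual Fan": (3, 1.27), "Reverse Fan": (6, 0.97), "Driver's Fan Off": (4, 0.51),
}

def pooled_sd(stats):
    """Pooled standard deviation and its (unadjusted) degrees of freedom."""
    stats = list(stats)
    df = sum(n - 1 for n, _ in stats)
    ss = sum((n - 1) * s ** 2 for n, s in stats)
    return math.sqrt(ss / df), df

sp, df = pooled_sd(table2.values())
print(round(sp, 2), df)

# Interval half-width is proportional to Sp / sqrt(n): n = 3 versus n = 6.
half_width_ratio = math.sqrt(6 / 3)
print(round(half_width_ratio, 3))
```

With a common Sp, an interval based on three tests is about 1.41 times as long as one based on six, which is why sample size influences which configurations appear to differ from the baseline.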


8.b.  NONPARAMETRIC  RESULTS 

Examining the results of the Kruskal-Wallis test with all the data from each of the three phases, the T-statistic is significant at the α = 0.01 level. Multiple comparisons were then performed on the average rank of each vehicle configuration. Figure 7 shows the pairwise results for the α = 0.10, 0.05 and 0.01 levels. For each level of significance, a line begins at each configuration and continues until a significant difference occurs. For example, referring to the α = 0.10 level, the average rank for the baseline vehicle, BL, is significantly different from all the other configurations. This is indicated by the horizontal line breaking when it reaches the next lowest average rank, BG. Then the line continues at BG through DF, meaning that BG is not significantly different from FM, FO, or DF. However, BG is significantly different from RF; therefore the line does not continue. A line beginning at the Fan Mod is not shown since its line is a subset of the previous line. That is, the average rank for FM is not significantly different from FO or DF but it is significantly different from RF. Likewise, a line is not drawn for FO, because its line is also a subset of the line for BG, indicating FO is not significantly different from DF, but is significantly different from RF. Lastly, a horizontal line is drawn under DF and RF to represent that there is no significant difference between the average ranks of the two modifications at the α = 0.10 level.

Similarly, the results are shown for α = 0.05 and 0.01. At α = 0.01, DF and RF are not
significantly different from each other, but both are significantly different from the baseline vehicle.

Figure 5. 90% Confidence Intervals on Vehicle Means, Phases I

Sp  =  1.13 


Figure 6. 99% Confidence Intervals on Vehicle Means, Phases I

Figure 7. Nonparametric Pairwise Results


9. TOLERANCE LIMITS

The analysis up to this point was concerned with significant differences among the means. We
discussed at the beginning of this paper the large sample size required if we were to guarantee that a large
percent of the population would remain under 10% with a very high level of confidence. However, we will
report on the tolerance limits for the normal distribution expected from these small samples of six and three
for the Reverse Fan and Dual Fan, respectively (see Table 3). The nonparametric tolerance limits for
sample sizes this small will have much lower values in confidence level and/or population
proportion. The exact values are being investigated.

Table 3. One-Sided Tolerance Limits for the Normal Distribution

Configuration    γ = [...]    γ = .95    γ = .99
Reverse Fan      P = .98      P = .96    P = .86
Dual Fan         P = .82      P < .75    -

γ = confidence level that P proportion of the population will have a COHb < 10%

10.  RECOMMENDATIONS 

Testing  of  the  modified  and  unmodified  Bradley  Fighting  Vehicle  within  TRADOC  Events  12  & 
15  and  subsequent  analysis  of  the  collected  data  leads  to  the  following  recommendations: 

(1)  that  the  Reversed  Fan  modification  be  considered  the  primary  solution  to 
the  Bradley  Fighting  Vehicle  toxic  fumes  acceptance  test  problem. 

(2)  that  the  Dual  Fan  modification  be  considered  the  secondary  solution  to 
the  Bradley  Fighting  Vehicle  toxic  fumes  acceptance  test  problem. 




[...] M2A2 vehicles and with the [...] obtained and the hypotheses formed within this test program [...] of the Reversed Fan modification [...] vehicle COHb levels below the limit specification, during the performance of the training and combat scenarios, be initiated.

(6) that vehicle-to-vehicle variability be investigated.

POSTTEST  PROGRAM  ACTIONS 

[...] Vehicles concluded that the Dual Fan Mod [...] as effective as the [...]

REFERENCES

1. Lyon, David H. and Dennis C. [...]

2. MIL-HDBK-759A, Human Factors Engineering [...]

3. DF, AMXHE-CC, 26 February 1985, Subject: Proposed revisions to MIL-HDBK-759A.

4. Conover, W. J., Practical Nonparametric Statistics, 2nd edition, John Wiley & Sons, Inc., 1980.

5. Kirk, Roger E., [...], 2nd edition, Brooks/Cole [...]




SOME  LIMITATIONS  OF  THE  RANK  TRANSFORMATION  TEST 
FOR  INTERACTION 


W.  J.  CONOVER 

College  of  Business  Administration 
Texas  Tech  University 
Lubbock,  Texas  79409 


ABSTRACT. The rank transformation is used widely to convert parametric
tests, such as the t-test and the F-test, to nonparametric tests such as the
Wilcoxon test and the Kruskal-Wallis test. It is also widely used in experimental
designs to convert analysis of variance procedures to robust procedures that have
superior power in some cases. As a test for interaction, the rank transformation
can lead to a test that is not valid. Some discussion of the limitations of this
use of the rank transformation is given in this paper.

1. INTRODUCTION. The Rank Transformation Methods refer to standard
classical  statistical  procedures  that  are  applied  to  the  ranks  of  the  data  rather 
than  to  the  data  themselves.  The  following  two  steps  are  involved. 

Step  1:  Replace  the  data  by  their  ranks,  from  rank  1  for  the  smallest,  to  rank 
K  for  the  largest  observation. 

Step  2:  Use  a  standard  statistical  procedure,  such  as  the  t-test  or  analysis  of 
variance  F-test,  on  the  ranks. 

Example :  Three  steel  mills  are  being  monitored  for  the  amount  of  smokestack 
contaminants  to  see  if  there  is  a  difference  in  mean  level  of  contamination.  Five 
randomly  selected  times  for  observation  lead  to  the  following  measurements. 


Factory A      Factory B      Factory C
46.3  (6)      48.6  (8)      45.1  (5)
43.7  (4)      52.3 (13)      46.7  (7)
51.2 (12)      50.9 (11)      41.8  (2)
49.6 (10)      53.6 (14)      40.4  (1)
48.8  (9)      55.7 (15)      42.6  (3)

The  classical  F  statistic  computed  on  the  data  gives 

$$F = \frac{SST/(k-1)}{SSE/(N-k)} = \frac{198.1/2}{89.6/12} = 13.27$$


which  is  compared  with  the  F  distribution  with  2  and  12  degrees  of  freedom. 
Because  the  upper  .05  quantile  is  3.885,  the  observed  value  is  significant. 


The  classical  F  statistic  computed  on  the  ranks,  given  in  parentheses,  gives 


$$F = \frac{SST/(k-1)}{SSE/(N-k)} = \frac{185.2/2}{94.8/12} = 11.72$$
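
Both F statistics can be reproduced from the fifteen measurements in the example. The Python sketch below is an illustration, with function names chosen here for convenience; it computes the one-way F statistic on the data, replaces the data by their ranks (Step 1), and recomputes the same statistic on the ranks (Step 2). The data contain no ties, which the ranking helper relies on.

```python
def one_way_f(groups):
    """Return (SST, SSE, F) for a one-way layout given as a list of groups."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    sst = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    sse = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return sst, sse, (sst / (k - 1)) / (sse / (n_total - k))

def rank_transform(groups):
    """Step 1: replace each observation by its rank in the pooled sample."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # valid because there are no ties
    return [[rank[v] for v in g] for g in groups]

mills = [
    [46.3, 43.7, 51.2, 49.6, 48.8],   # Factory A
    [48.6, 52.3, 50.9, 53.6, 55.7],   # Factory B
    [45.1, 46.7, 41.8, 40.4, 42.6],   # Factory C
]

sst, sse, f_data = one_way_f(mills)
sst_r, sse_r, f_rank = one_way_f(rank_transform(mills))
print(round(sst, 1), round(sse, 1), round(f_data, 2))
print(round(sst_r, 1), round(sse_r, 1), round(f_rank, 2))
```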

In some cases the Rank Transform Procedure is equivalent to a well-known nonparametric test:

Classical Procedure                            Rank Transform Procedure
The t-Test on 2 Independent Samples            The Wilcoxon Rank Sum Test
One-way Analysis of Variance                   The Kruskal-Wallis Test
Correlation Test for Bivariate Independence    Spearman's Rho Test for Independence

In other cases this results in [...]

The Randomized Complete Block Analysis of Variance
The Balanced Incomplete Blocks Analysis of Variance
Analysis of Variance Interaction [...]

[...]

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$$

i = 1, 2, 3, 4 (4 treatments)
j = 1, 2, 3 (3 blocks)
k = 1, 2, 3, 4, 5 (5 observations per cell)

[...] interaction ($\gamma_{ij} = 0$), the F statistic for interaction computed on ranks [...] followed the usual F distribution closely for the two cases studied:

Case 1: $\alpha_i = 0$, $\beta_j = 0$ (No treatment or block effects)

Case 2: $\alpha_i = 0$, $\beta_1 = \beta_2 = 0$, $\beta_3 = 1$ (No treatment effects)

This provided the basis for suggestions to use the rank transformation
in tests for interaction, both in papers by Conover and Iman and in the users manuals
of SAS and IMSL.

3. RECENT RESULTS. Later simulation results (Blair, Sawilowsky and Higgins,
1987) showed that under some extreme conditions, in the presence of both block
and treatment effects, the performance of the rank transform test for interaction
could be both non-robust and lacking in power. They examined the following
case, and varied the constant c as described in the table below.

Case 1: $\alpha_1 = \alpha_4 = 0$, $\alpha_2 = c$, $\alpha_3 = -c$ (treatment effects)

$\beta_1 = c$, $\beta_2 = -c$, $\beta_3 = 0$ (block effects)

$\gamma_{ij} = 0$ (no interaction, the null case)

The linear model used is the same one used by Iman, and given above. The error
terms were taken to be standard normal, and the number of observations per cell
was n, which was varied along with c as described in the following table. The
entry in this table is an estimate of the true level of significance, at a
nominal alpha level of .05, obtained by simulation with 1000 runs.


            n = 2    n = 5    n = 10   n = 20   n = 50
c = 0.5     .056     .049     .053     .050     .054
c = 1.0     .046     .053     .073     .101     .193
c = 1.5     .053     .076     .132     .309     .848
c = 2.0     .044     .105     .326     .803     1.000
c = 2.5     .053     .186     .682     .997     1.000

The table shows that for reasonably large shifts in treatment effects, the rank
transformation test for interaction is not robust, even for fairly small sample
sizes.
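
The effect in the table can be seen in a single simulated data set. The sketch below is an illustration only, not the simulation code of Blair, Sawilowsky and Higgins: it generates one 4 x 3 layout with no interaction under the case above (c = 2.5, n = 50, standard normal errors), then computes the interaction F statistic on the raw data and on the ranks. The function names and the seed are arbitrary choices made here. For 6 and 588 degrees of freedom the upper .05 point of F is roughly 2.1.

```python
import random

def interaction_f(cells):
    """Interaction F for a balanced two-way layout; cells[i][j] lists the observations."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[sum(c) / n for c in row] for row in cells]
    row_mean = [sum(r) / b for r in cell_mean]
    col_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(row_mean) / a
    ss_ab = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    sse = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    return (ss_ab / ((a - 1) * (b - 1))) / (sse / (a * b * (n - 1)))

def to_ranks(cells):
    """Replace every observation by its rank in the pooled sample."""
    flat = sorted(x for row in cells for c in row for x in c)
    rank = {x: r + 1 for r, x in enumerate(flat)}  # continuous data, ties not expected
    return [[[rank[x] for x in c] for c in row] for row in cells]

def simulate(c, n, rng):
    """Additive model: treatment and block effects, no interaction."""
    alpha = [0.0, c, -c, 0.0]
    beta = [c, -c, 0.0]
    return [[[a_i + b_j + rng.gauss(0.0, 1.0) for _ in range(n)]
             for b_j in beta] for a_i in alpha]

rng = random.Random(1991)
cells = simulate(c=2.5, n=50, rng=rng)
f_raw = interaction_f(cells)
f_rank = interaction_f(to_ranks(cells))
print(round(f_raw, 2), round(f_rank, 2))
```

The data are additive by construction, yet the ranks are a nonlinear function of the data, so the cell means of the ranks are no longer additive and the F statistic on ranks signals an interaction that is not there.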

These  results  inspired  a  study  of  the  theory  behind  this  test,  by  Thompson 
(1991). She assumed the model was linear, as above, and found that the rank
transformation test for interaction was valid if and only if at least one main
effect (blocks and/or treatments) was not present. Further, she found that if
both effects were present, the mean of the F statistic on ranks has a term that
increases without bound as n increases, thus forcing α to 1.0.

Choi  (1991)  found  similar  results  for  the  general  model 


$$X_{ijk} \sim F(x - \alpha_i - \beta_j - \gamma_{ij})$$


This  explains  why  Iman  (1974)  detected  no  problem  with  the  rank  transform  test 
for  interaction,  because  he  looked  only  at  cases  where  one  main  effect  was  zero, 
and why Blair et al. (1987) found serious problems, because they reported only
cases  where  both  main  effects  were  present. 

4. CONCLUSIONS. The [...] conclusion of this paper is that the Rank Transform Procedure [...] should not be applied thoughtlessly to every analysis for which a classical procedure exists.

REFERENCES

Blair, R.C., Sawilowsky, S.S. and Higgins, J.J. (1987). Limitations of the Rank Transform Statistic in Tests for Interaction, Communications in Statistics - Simulation and Computation, 16(4), 1133-1145.

Iman, R.L. (1974). [...]

Thompson, G.L. (1991). A Note on the Rank Transformation for Interactions, Biometrika, 78(3), 697-701.


ON A NEW SYSTEM OF
MULTIVARIATE  DISTRIBUTIONS 


Major  Kevin  M.  Beam 
U.S.  Army  TRADOC  Research  Associate 
RAND 

1700  Main  Street,  P.O.  Box  2138 
Santa  Monica,  CA  90407  -  2138 

Albert  S.  Paulson 
School  of  Management 
Rensselaer  Polytechnic  Institute 
Troy,  NY  12180  -  3590 


ABSTRACT.  There  has  been  substantial  interest  in  multivariate  probability 
distributions  with  given  margins  since  Galton's  (1885)  investigations.  For  a  multivariate 
system  to  be  useful,  we  require  it  to  be  flexibly  constructed  and  easily  used  in  the  modeling 
of  multivariate  data.  Furthermore,  it  should  possess  computational  ease  and  be  intuitively 
appealing.  Its  parameters  should  represent  important  physical  properties,  e.g.,  measures  of 
scale,  location,  shape  and  correlation.  We  propose  such  a  multivariate  system:  the  Diagonal 
Perturbation  System. 

We  construct  the  system  as  a  multivariate  refinement  of  the  intuitively  appealing 
framework  of  the  Neyman  alternative.  We  demonstrate  the  system's  flexibility  by 
presenting  several  univariate  and  bivariate  models,  and  bivariate  constructions  in  which 
the  margins  are  univariate  versions  of  the  system.  Additionally,  we  find  the  system's 
parameters  to  represent  physical  properties.  We  show  the  system  to  be  readily  implemented 
numerically.  We  demonstrate  the  system’s  utility  through  the  successful  modeling  of 
multivariate  data  that  has  eluded  fitting  for  over  a  decade.  We  employ  nonlinear 
minimization  to  produce  the  least  squares  parameter  estimates  while  capturing  not  only  the 
usual  sum  of  squared  errors  but  also  the  margin’s  first  two  moments  and  the  first  mixed 
moment. 


1.  INTRODUCTION.  There  has  been  an  interest  in  multivariate  distributions  with 
given margins for several years. These include the systems of Morgenstern (1956), Gumbel
(1960, 1961), and Farlie (1960); Sibuya (1960); Plackett (1965); Ali, Mikhail, and Haq (1978);
Frank (1979); Clayton (1978), Cook and Johnson (1981), Clayton and Cuzick (1985); and
Marshall and Olkin (1988). For a general discussion see Mardia (1970a), Johnson and Kotz
(1972),  and  Johnson  (1987).  We  propose  a  system  of  distributions  which  is  flexibly 
constructed,  intuitively  appealing  and  easily  used  in  the  fitting  of  multivariate  data. 

Neyman (1937) proposed for the density of any random variable, X, the alternative:

$$f(x) = c \exp\left[\sum_{j=1}^{k} \theta_j l_j(x)\right]; \quad 0 \le x \le 1 \text{ and } k = 1, 2, \ldots$$

The $l_j$ are Legendre polynomials; $\theta_j$ are parameters; and c, a function of the $\theta_j$, is a
normalizing constant. We construct our system as a multivariate refinement of this
intuitively appealing framework.

[...] constructed from [...] and consider examples when [...] We consider two analogous margin preserving bivariate constructions, again each constructed from the uniform and gaussian distributions. The bivariate distributions with uniform margins are copulas. We consider a third bivariate construction which does not preserve the margins. As a final example of the system's flexibility, we consider a bivariate construction in which the margins are univariate versions of the system. We successfully model two bivariate data sets and give direction to future research.


2. THE DIAGONAL PERTURBATION SYSTEM. Let $X_1, \ldots, X_n$ be random variables (r.v.'s) with marginal distribution functions (d.f.'s) $G_1(X_1), \ldots, G_n(X_n)$; survival functions (s.f.'s) $\bar{G}_1(X_1), \ldots, \bar{G}_n(X_n)$; and probability density functions (p.d.f.'s), if continuous, or probability mass functions, if discrete, $g_1(X_1), \ldots, g_n(X_n)$. The system is constructed as

$$F(X_1, \ldots, X_n) = \left[\prod_i G_i(X_i)\right] \exp\left[\Delta(\bar{G}_i(X_i))\right]; \quad i = 1, \ldots, n. \tag{1}$$

Here $\prod$ denotes product and $\Delta(\bar{G}_i(X_i))$ is a function of the s.f.'s. We now suppress the arguments of the functions. If $\Delta(\bar{G}_i) = 0$, we have the product of the marginal distributions, the uncorrelated case. If $\Delta(\bar{G}_i) \ne 0$, we have correlation, that is, changes in the variance-covariance structure. The notation, $\Delta$, is appropriate as we have, in a sense, a change in the product of the marginal distributions as $\Delta(\bar{G}_i)$ varies from zero.

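3. CONSTRUCTIONS CONSIDERED. Although there are infinite [...] of (1), [...] constructions, where $\Delta(\bar{G}_i)$ is a polynomial function. F is the diagonal perturbation d.f. [...] X and Y, G is [...]

Univariate Constructions

$$F = G \exp(\alpha_p \bar{G}^p) \tag{2}$$

$$F = G \exp(\alpha_1 \bar{G} + \alpha_2 \bar{G}^2) \tag{3}$$

Bivariate Constructions

$$F = GH \exp[\alpha_p (\bar{G}\bar{H})^p] \tag{4}$$

$$F = GH \exp[\alpha_1 \bar{G}\bar{H} + \alpha_2 (\bar{G}\bar{H})^2] \tag{5}$$

$$F = GH \exp(\alpha_{10} \bar{G} + \alpha_{01} \bar{H} + \alpha_{11} \bar{G}\bar{H}) \tag{6}$$

Bivariate constructions (4) [...] preserve the margins as $\int f\,dy = g$, where [...] and p = 1, or version (3) if [...]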

4. UNIVARIATE CONSTRUCTIONS.

4.1. Examples of the First Univariate Construction. The density for (2) is

$$f = (1 - p\,\alpha_p G \bar{G}^{p-1})\, g \exp(\alpha_p \bar{G}^p). \tag{7}$$

For (2) to be a d.f. we require:

a) $F(-\infty) = 0$, b) $F(\infty) = 1$, and c) $F(x+h) \ge F(x)$ for $h \ge 0$.

The first two requirements are immediate and the final requirement reduces to, for $1 \le p < \infty$: $-\infty < \alpha_p \le 1$ for p = 1, or $-\infty < \alpha_p \le \left(\frac{p}{p-1}\right)^{p-1}$ for p > 1. We see that the upper limit on $\alpha_p$ tends to e, the natural exponent, as p increases.
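
The bound follows from requiring the factor $1 - p\,\alpha_p G \bar{G}^{p-1}$ in (7) to stay non-negative for every value of G in [0, 1]; the product $G \bar{G}^{p-1}$ is largest at G = 1/p. The Python sketch below checks this numerically on a grid. The function names and the grid size are choices made here for illustration.

```python
import math

def alpha_upper_bound(p):
    """Largest alpha_p for which the sign-determining factor in (7) is non-negative."""
    return 1.0 if p == 1 else (p / (p - 1)) ** (p - 1)

def min_density_factor(alpha, p, grid=10001):
    """Minimum over G in [0, 1] of 1 - p * alpha * G * (1 - G)**(p - 1)."""
    values = (i / (grid - 1) for i in range(grid))
    return min(1.0 - p * alpha * g * (1.0 - g) ** (p - 1) for g in values)

for p in (1, 2, 3, 10, 1000):
    print(p, round(alpha_upper_bound(p), 5))
print("e =", round(math.e, 5))

# At the bound the factor just touches zero; slightly above it, it goes negative.
print(min_density_factor(2.0, 2), min_density_factor(2.1, 2))
```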

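If G is the uniform d.f. on [0,1], $f = (1 - p\,\alpha_p (1-x)^{p-1} x)\, g \exp(\alpha_p (1-x)^p)$. If p = 1, we have

$$E(X^n) = 1 - \frac{n\,n!}{\alpha_1^{n+1}}\, e^{\alpha_1} + \sum_{i=0}^{n} \frac{n\,n!}{(n-i)!\,\alpha_1^{i+1}}, \quad \text{where } \lim_{\alpha_1 \to 0} E(X^n) = \frac{1}{n+1},$$

the nth moment for X ~ U(0,1); and

$$Var(X) = \frac{\alpha_1^2 + 2\alpha_1 - 1 + 2\alpha_1^2 e^{\alpha_1} - 2\alpha_1 e^{\alpha_1} + 2e^{\alpha_1} - e^{2\alpha_1}}{\alpha_1^4}, \quad \text{where } \lim_{\alpha_1 \to 0} Var(X) = \frac{1}{12}.$$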
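
These closed forms can be checked against direct numerical integration of the density. The Python sketch below does this for the uniform case with p = 1; the value of the parameter, the function names, and the midpoint-rule integration are choices made here for illustration, not part of the original analysis.

```python
import math

def density(x, alpha):
    """Density (7) with G uniform on [0, 1] (g = 1) and p = 1."""
    return (1.0 - alpha * x) * math.exp(alpha * (1.0 - x))

def moment_closed_form(n, alpha):
    """E(X^n) from the closed-form expression for p = 1."""
    lead = n * math.factorial(n)
    series = sum(lead / (math.factorial(n - i) * alpha ** (i + 1)) for i in range(n + 1))
    return 1.0 - lead * math.exp(alpha) / alpha ** (n + 1) + series

def moment_numeric(n, alpha, steps=20000):
    """E(X^n) by the composite midpoint rule on [0, 1]."""
    h = 1.0 / steps
    return h * sum(((k + 0.5) * h) ** n * density((k + 0.5) * h, alpha) for k in range(steps))

def variance_closed_form(alpha):
    e1, e2 = math.exp(alpha), math.exp(2.0 * alpha)
    num = alpha ** 2 + 2 * alpha - 1 + 2 * alpha ** 2 * e1 - 2 * alpha * e1 + 2 * e1 - e2
    return num / alpha ** 4

ALPHA = 0.8  # illustrative value within the admissible range alpha_1 <= 1
print(round(moment_numeric(0, ALPHA), 6))  # total probability
print(round(moment_closed_form(1, ALPHA), 6), round(moment_numeric(1, ALPHA), 6))
print(round(variance_closed_form(ALPHA), 6))
```

For values of the parameter very close to zero the closed forms involve heavy cancellation, so a numerical check of the limits 1/(n+1) and 1/12 is better done with the integration routine than with the formulas themselves.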

Various plots of (7) constructed from U(0,1), where p = 1, are given in Figure 1. Figure 2 shows the effect of $\alpha_1$ on $\mu$, $\sigma^2$, $\gamma_1$, and $\gamma_2$, where $\gamma_1$ is the coefficient of skewness, $\mu_3 / \mu_2^{3/2}$, and $\gamma_2$ is the coefficient of kurtosis, $\mu_4 / \mu_2^2$; where $\mu_i$ is the ith moment about $\mu$. As $\alpha_1$ increases we see $\mu$ decreasing, a small decrease in $\sigma^2$, a large increase in $\gamma_1$ and a decrease then increase in $\gamma_2$. Further analysis of the effect of the parameter $\alpha_1$ on the moment ratios is presented in Figure 3 where we see (7) graphed in the $\beta_1$ - $\beta_2$ plane for varying values of $\alpha_1$ where $\beta_1 = \gamma_1^2$ and $\beta_2 = \gamma_2$. The $\beta_1$ - $\beta_2$ plane is presented for reference in Figure 4 where the equations for the bounding curves and the location of the Pearson densities are taken from Pearson and Hartley (1970). As expected, $(\mu, \sigma^2, \gamma_1, \gamma_2) = (1/2, 1/12, 0, 9/5)$ when $\alpha_1 = 0$. We see extreme values of $\alpha_1$ effecting J-shaped beta type I densities.

As another example of (7) constructed from the uniform distribution, we consider $p = 2$. Graphs of (7) are presented in Figure 5 for various values of $\alpha_2$. The effect of $\alpha_2$ on the moments and moment ratios is presented in Figure 6 and Figure 7. We see more extreme distortion of the density. Of real interest are the U-shaped beta type I densities for $\alpha_2 > 0$ and beta type II densities for $\alpha_2 < 0$.

If (7) is constructed from N(0,1), the expectations are not tractable in closed form and quadrature is required for all results. For $p = 1$ we again see positive skewness for positive values of $\alpha_1$ and negative skewness for negative values of $\alpha_1$ in Figure 8. The moments and moment ratios are presented in Figure 9. As expected, $(\mu, \sigma^2, \gamma_1, \gamma_2) = (0, 1, 0, 3)$ when $\alpha_1 = 0$. We see $\sigma^2$ decreasing as $\alpha_1$ changes from 0. The effect of $\alpha_1$ on the gaussian
FIGURE 1 Density (7) Constructed from U(0,1); $p = 1$; $\alpha_1 = 1.0, 0.0, -1.0$

FIGURE 2 Moments and Moment Ratios of Density (7) Constructed from U(0,1) as a Function of $\alpha_1$

FIGURE 3 The Effect of $\alpha_1$ on Density (7) Constructed from U(0,1) in the $\beta_1$ - $\beta_2$ Plane

FIGURE 4 The $\beta_1$ - $\beta_2$ Plane with Locations of Some Common and Pearson-Type Distributions

FIGURE 5 Density (7) Constructed from U(0,1); $p = 2$; $\alpha_2 = 2.0, -2.0, -4.0$

FIGURE 6 Moments and Moment Ratios of Density (7) Constructed from U(0,1) as a Function of $\alpha_2$

FIGURE 7 The Effect of $\alpha_2$ on Density (7) Constructed from U(0,1) in the $\beta_1$ - $\beta_2$ Plane


^  ^  P*ane 's  ex®luded  as  there  is  very  little  movement  away  from 

0G;;phs  -*»*.  whe„ , ..  2 

of  are  omitted  a»  one  can  ascertain  from  the  Zl  lZ  °r8phs  f°r  “**«"  » Line, 

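4.2. Examples of the Second Univariate Construction. The density for (3) is

$$f = (1 - \alpha_1 G - 2\alpha_2 G \bar G)\, g \exp[\alpha_1 \bar G + \alpha_2 \bar G^{2}]. \qquad (8)$$

For (3) to be a d.f., again the first two requirements are immediate and the final requirement, from Section 4.1, reduces to $(-1 \le \alpha_1 \le 1)$ and $(\alpha_2 \le 1 - \alpha_1/2 + \sqrt{1 - \alpha_1})$.

The densities possible by construction of (8)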
maximum  values  of  u2  are  considered  ate  esWbl’te'd  inT’*8"  de,Kity  when  »"'?  ‘  ' 

encompasses  near,,  one-haif  of  the  admits  ".X'  *,  T  *"  *— 

6  oi  me  pj  -  p2  p]ane  presented 

5. BIVARIATE CONSTRUCTIONS.

5.1. Examples of the First Bivariate Construction. The density for (4) is

$$f = \{1 + p\,\alpha_p (\bar G \bar H)^{p-1} [\,p\,GH - G\bar H - \bar G H + p\,\alpha_p\, GH (\bar G \bar H)^{p}\,]\}\; gh \exp[\alpha_p (\bar G \bar H)^{p}]. \qquad (9)$$

The sufficient conditions for (4) to be a d.f. (Mardia, 1970a) are

a) $F(\infty, \infty) = 1$,
b) $F(-\infty, y) = 0$,
c) $F(x, -\infty) = 0$, and
d) $F(x+h, y+h) - F(x+h, y) - F(x, y+h) + F(x, y) \ge 0$.

It is clear the d.f. satisfies the first three conditions; the final condition reduces to conditions on the parameters $p$ and $\alpha_p$.

7“>8  7  ®  «■  ^metric.  the  minimumvalue  “  ®  -  » 

f  *  fU"Cti°n  °f  P  “i8'id8S  »ith  »-  univariate  eonTtruct!.  IVTf^  ^1“  ^ 

K  )-  "  y  =  *.  the  lower  limit 

r  lB2x2+4DX2.arv^  ft||  (  . 


2px(l-x)2P 
FIGURE 8 Density (7) Constructed from N(0,1); $p = 1$; $\alpha_1 = 1.0, 0.0, -2.0$

FIGURE 9 Moments and Moment Ratios of Density (7) Constructed from N(0,1) as a Function of $\alpha_1$

FIGURE 10 Density (7) Constructed from N(0,1); $p = 2$; $\alpha_2 = 2.0, 1.5, 1.0$

FIGURE 11 Moments and Moment Ratios of Density (7) Constructed from N(0,1) as a Function of $\alpha_2$

FIGURE 12 The Effect of $\alpha_1$ and $\alpha_2$ on Density (7) Constructed from N(0,1) in the $\beta_1$ - $\beta_2$ Plane
C  +  B/C 

Here  the  minimum  value  of  f  is  at  x  =  |  a|  J  -  “jf  .  where  A  = 


2Dj  -  9DiD2  -  27/D0 
Di  -  3Do  0  5p2 + 4p  ,

B  =  -  V—  ,  and  C  =  [(A2  -  B3)1'2  +  \M]m.  Here,  D0  =  P3  +  4p2,  D1  =  -  Dq  ,  and 
y 

1/2 

D„  =  4p  — .  We  see  lim  ap  =  - 1  and  lim  ap  =  2  -  P  +  2ff  e?  =  '  646-7719769-’ 

2  D0  p  «*>  1  p  p  ->  oo  r 

where  f,  =  [g  e  [f  -  S2|2!]M+ 1  (Myers,  1990). 

We give an example of (9) with gaussian margins in Figure 14. We see contours unlike those of the standard correlated bivariate gaussian. In Figure 15 we present the effect of $p$ and $\alpha_p$ on the Pearson product-moment correlation coefficient, $\rho$, when (9) is constructed from gaussian margins. We see negative/positive values of $\alpha_p$ producing negative/positive values of $\rho$. Figures 16 and 17 show the corresponding contours of Mardia's (1970b, 1974) multivariate measures of skewness and kurtosis, $\beta_{1,2}$ and $\beta_{2,2}$. As expected, $(\rho, \beta_{1,2}, \beta_{2,2}) = (0, 0, 8)$ when $\alpha_p = 0$ for all values of $p$.

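5.2. Examples of the Second Bivariate Construction. The density for (5) is

$$f = \{1 + \alpha_1 (GH - G\bar H - \bar G H) - 2\alpha_2 (G + H)\bar G \bar H + [\alpha_1^2 + 8\alpha_2 + 4\alpha_1\alpha_2 \bar G \bar H + 4(\alpha_2 \bar G \bar H)^2]\, GH \bar G \bar H\}\; gh \exp[\alpha_1 \bar G \bar H + \alpha_2 (\bar G \bar H)^2]. \qquad (10)$$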


For (5) to be a d.f., again, the first three requirements are immediate and the final requirement, from Section 5.1, yields limits on the parameters $\alpha_1$ and $\alpha_2$. We find $(-1 \le \alpha_1 < 1)$ and $(\alpha_2 < 1 - \alpha_1/2 + \sqrt{1 - \alpha_1})$. We see the upper limit of $\alpha_2$ as a function of $\alpha_1$ coincides with that of univariate construction (3). As a closed form for the lower limit of $\alpha_2$ has eluded us, numerical computations have provided values for lower bounds. Table 1 gives the minimum $\alpha_2$ as a function of representative values of $\alpha_1$.


$\alpha_1$        -1.0     -0.9     -0.8     -0.7     -0.6     -0.5     -0.4     -0.3     -0.2     -0.1      0.0
Min $\alpha_2$   -9.927  -10.223  -10.460  -10.655  -10.815  -10.947  -11.053  -11.138  -11.204  -11.252  -11.284

$\alpha_1$         0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1.0
Min $\alpha_2$  -11.302  -11.306  -11.298  -11.278  -11.247  -11.206  -11.155  -11.095  -11.027  -10.951

TABLE 1 Minimum $\alpha_2$ as a function of $\alpha_1$
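One way such lower bounds could be approximated numerically is sketched below. It searches for the most negative $\alpha_2$ that keeps the braced factor of density (10) nonnegative over a grid of $(G, H)$ values in the unit square. This is not the authors' procedure: the grid resolution, the bisection bracket, and all function names are assumptions made for illustration, and the result depends on the grid.

```python
import numpy as np

def bracket10(G, H, a1, a2):
    """Braced factor of density (10); (10) is nonnegative exactly when this is."""
    Gb, Hb = 1.0 - G, 1.0 - H
    u = Gb * Hb
    return (1.0 + a1 * (G * H - G * Hb - Gb * H)
            - 2.0 * a2 * (G + H) * u
            + (a1 ** 2 + 8.0 * a2 + 4.0 * a1 * a2 * u + 4.0 * (a2 * u) ** 2) * G * H * u)

def min_bracket(a1, a2, m=401):
    """Smallest value of the braced factor on an m-by-m grid over [0,1]^2."""
    grid = np.linspace(0.0, 1.0, m)
    G, H = np.meshgrid(grid, grid)
    return bracket10(G, H, a1, a2).min()

def lower_limit_a2(a1, lo=-20.0, hi=0.0, tol=1e-4):
    """Bisection between an infeasible lo and a feasible hi value of alpha_2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if min_bracket(a1, mid) >= 0.0:
            hi = mid      # still a valid density on the grid
        else:
            lo = mid      # density goes negative somewhere
    return hi

print(round(lower_limit_a2(0.0), 2))
```

The printed value can be compared with the Table 1 entry for $\alpha_1 = 0$; a finer grid moves the estimate closer to the true boundary.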

Graphs of (10) constructed from U(0,1) and N(0,1) margins are presented in Figures 18 through 21. Figures 22 and 23 are contour graphs of $\rho$ as a function of $\alpha_1$ and $\alpha_2$ for the uniform and gaussian cases respectively. Figures 24 and 25 present contours of Mardia's multivariate moment ratios, $\beta_{1,2}$ and $\beta_{2,2}$, for (10) constructed from N(0,1). Again $(\rho, \beta_{1,2}, \beta_{2,2}) = (0, 0, 8)$ when $(\alpha_1, \alpha_2) = (0, 0)$. We see (10) to be very flexible, allowing for the construction of skewed and kurtic surfaces with negative and positive correlation.
FIGURE 14 Density (9) Constructed from N(0,1)

FIGURE 15 Effect of $p$ and $\alpha_p$ on $\rho$ for Density (9) Constructed from N(0,1)

FIGURE 16 Contour Graph of $\beta_{1,2}$ as a Function of $p$ and $\alpha_p$ for Density (9) Constructed from N(0,1)

FIGURE 17 Contour Graph of $\beta_{2,2}$ as a Function of $p$ and $\alpha_p$ for Density (9) Constructed from N(0,1)

FIGURE 18 Density (10) Constructed from U(0,1), Preserving the Margins, $(\alpha_1, \alpha_2) = (0.0, -11.25)$, $\rho = -0.52$

FIGURE 19 Density (10) Constructed from U(0,1), Preserving the Margins, $(\alpha_1, \alpha_2) = (-1.0, 2.90)$, $\rho = -0.07$

FIGURE 22 Contour Graph of $\rho$ as a Function of $\alpha_1$ and $\alpha_2$ for Density (10) Constructed from U(0,1)

FIGURE 23 Contour Graph of $\rho$ as a Function of $\alpha_1$ and $\alpha_2$ for Density (10) Constructed from N(0,1)

FIGURE 24 Contour Graph of $\beta_{1,2}$ as a Function of $\alpha_1$ and $\alpha_2$ for Density (10) Constructed from N(0,1)

FIGURE 25 Contour Graph of $\beta_{2,2}$ as a Function of $\alpha_1$ and $\alpha_2$ for Density (10) Constructed from N(0,1)

5.3. Examples of the Third Bivariate Construction. The density for (6) is

$$f = [1 - \alpha_{10} G - \alpha_{01} H + \alpha_{11}(GH - G\bar H - \bar G H) + (\alpha_{10}\alpha_{01} + \alpha_{10}\alpha_{11}\bar G + \alpha_{01}\alpha_{11}\bar H + \alpha_{11}^2 \bar G \bar H)\, GH]\; gh \exp(\alpha_{10}\bar G + \alpha_{01}\bar H + \alpha_{11}\bar G \bar H). \qquad (11)$$

Little is known about the values of the parameters required to preserve the nonnegativity of (11). We present graphs of (11) constructed from U(0,1) in Figures 26 and 27, from N(0,1) in Figure 28, and from Exp(1) in Figure 29. We recall from part 3 that this construction does not preserve the margins. We note changes in the marginal means and variances when $\alpha_{10}$ or $\alpha_{01} \ne 0$.


6. A Final Bivariate Construction. The final bivariate form of (1) we examine is a bivariate construction, where X and Y have marginal d.f.'s (2) and joint d.f. (4). We have

$$F_x = G_x \exp(\alpha_x \bar G_x^{\,p_x}) \quad \text{and} \quad F_y = G_y \exp(\alpha_y \bar G_y^{\,p_y}),$$

resulting in the bivariate distribution,

$$F_{xy} = F_x F_y \exp[\alpha_{xy} (\bar F_x \bar F_y)^{p_{xy}}]. \qquad (12)$$

Several graphs of the density of (12) constructed from the gaussian distribution are presented in Figures 30 through 33. The robustness of this construction is exhibited in these graphs. We are able to construct both uncorrelated and negatively or positively correlated surfaces which are skewed or non-skewed and either uni-, bi-, tri-, or quadra-modal.

While not presented in this paper, we have investigated several forms of (9), (10), (11) and (12) with margins (7) and (8) constructed from Beta, Cauchy, Gamma, Gaussian, Laplace, Logistic, Uniform, and Weibull densities. It was recreative to observe the resulting surfaces.

7.  An  Innovative  Approach  to  the  Modeling  of  Bivariate  Data.  Here  we 
exhibit  the  utility  of  the  proposed  system  by  fitting  aircraft  operations  and  maintenance 
data.  Periodically  aircraft  undergo  large  scale  overhaul  programs.  This  presumes  that  the 
aircraft  are  restored  to  a  better  operating  condition.  The  r.v.’s  are  defined  to  be  the  number 
of  aircraft  which  suffer  n  aborts  in  a  six  month  period.  Aborts  are  mission  interruptions 
occurring during pre-flight or in-flight operations. We consider the bivariate case of two
consecutive  six  month  periods.  The  following  diagram  indicates  that  between  periods  1  and 
2  there  is  no  intervening  overhaul  and  that  an  overhaul  occurs  between  periods  3  and  4. 


Period:      1        2        ...        3        4        --> time
             (No Overhaul)                (Overhaul)

Thus the bivariate r.v.'s considered are $A_{ij}(x,y)$, $(i,j) = (1,2)$ or $(3,4)$; the number of aircraft
suffering  x  aborts  in  period  i  and  y  aborts  in  period  j.  Table  2  gives  the  data  for  203  aircraft 
in  periods  (1,2).  The  sample  is  obtained  by  considering  the  entire  inventory  of  a  particular 
aircraft  (about  500)  and  excluding  those  which  are  overhauled  during  periods  1  and  2,  and 
those  without  12  full  months  of  data  during  this  time  -  203  result  (Mitchell,  1976).  As  an 
illustration  of  the  data,  three  aircraft  have  ten  aborts  in  period  1  followed  by  six  in  period  2. 
FIGURES 30 through 33 Density of (12) Constructed from N(0,1), each annotated with $(p_x, p_y, p_{xy})$, $(\alpha_x, \alpha_y, \alpha_{xy})$, $\rho$, $(\mu_x, \mu_y)$, and $(\sigma_x^2, \sigma_y^2)$
Table 3 gives the data for 387 aircraft of the same type. Here an aircraft is included if it has six full months of reported data during adjacent time periods, (3,4), to a common overhaul.


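The margins of each of the bivariate distributions are first assumed to be the negative binomial p.m.f. given by Johnson and Kotz (1969),

$$\Pr[X = x] = \frac{\Gamma(x + a)}{x!\,\Gamma(a)} \left(\frac{1}{1+b}\right)^{a} \left(\frac{b}{1+b}\right)^{x}; \quad a > 0,\ b > 0,\ x = 0, 1, 2, \ldots \qquad (13)$$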
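For readers who want to evaluate (13) directly, here is a minimal sketch. It assumes the parameterization $\Pr[X = x] = \frac{\Gamma(x+a)}{x!\,\Gamma(a)} (1+b)^{-a} (b/(1+b))^{x}$, which is a reading of the printed formula rather than a confirmed transcription. Under that assumption the mean is $ab$ and the variance is $ab(1+b)$. The function name is hypothetical.

```python
import math

def neg_binomial_pmf(x, a, b):
    """Negative binomial p.m.f. under the assumed parameterization (a > 0, b > 0)."""
    log_coef = math.lgamma(x + a) - math.lgamma(x + 1) - math.lgamma(a)
    log_p = -a * math.log1p(b) + x * (math.log(b) - math.log1p(b))
    return math.exp(log_coef + log_p)

# Method-of-moments values quoted later in the text, used purely as an example.
a, b = 2.01, 2.98
probs = [neg_binomial_pmf(x, a, b) for x in range(500)]
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
print(sum(probs), mean, var)
```

Working on the log scale with `lgamma` avoids overflow in the Gamma ratio for large counts.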


Table  4  gives  descriptive  statistics  for  the  data  and  it  is  apparent  that  the  univariate 
negative  binomial  distribution  is  an  appropriate  choice  to  begin  modeling  the  margins. 

We select distribution (12) with margins (2) constructed from (13) to fit the bivariate data. We simultaneously estimated the ten parameters using least squares, invoking a NAG (Numerical Algorithms Group, 1983) nonlinear minimization routine. We originally estimated the parameters by considering only the sum of squared errors (SSE) between the empirical and expected p.d.f.'s. While the SSE's were smaller than those presented in Table 4, the resulting bivariate models failed to approximate the marginal and bivariate moments. Thus we minimized an objective function composed of the usual SSE of the p.d.f.'s plus weighted SSE's of the margins' first two moments and the first bivariate mixed moment.
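The structure of such an objective can be sketched as follows. This is an illustration only: the weights, the array layout (rows indexed by x, columns by y), and the function names are assumptions, and the sketch does not reproduce the NAG routine or the authors' actual weighting.

```python
import numpy as np

def moments(pmf):
    """Marginal means, marginal variances, and mixed moment E(XY) of a joint p.m.f."""
    pmf = pmf / pmf.sum()                      # treat entries as relative frequencies
    x = np.arange(pmf.shape[0])[:, None]
    y = np.arange(pmf.shape[1])[None, :]
    mx, my = (x * pmf).sum(), (y * pmf).sum()
    vx = ((x - mx) ** 2 * pmf).sum()
    vy = ((y - my) ** 2 * pmf).sum()
    exy = (x * y * pmf).sum()
    return np.array([mx, my, vx, vy, exy])

def objective(expected, empirical, weights):
    """SSE between the p.m.f.'s plus weighted squared moment discrepancies."""
    sse = ((expected - empirical) ** 2).sum()
    penalty = (weights * (moments(expected) - moments(empirical)) ** 2).sum()
    return sse + penalty

# Hypothetical 2 x 2 example: a perfect fit gives zero, any mismatch is positive.
empirical = np.array([[0.2, 0.1], [0.3, 0.4]])
trial = np.full((2, 2), 0.25)
print(objective(empirical, empirical, np.ones(5)), objective(trial, empirical, np.ones(5)))
```

Setting all weights to zero recovers the plain SSE criterion that the text describes as the original, less satisfactory, approach.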

The parameter estimates and results are given in Table 4. Thus we were able to consider not only the usual ordinary least squares but also the discrepancies between the margins' means and variances and the bivariate correlation. We attempted to include the higher univariate moments, $\beta_1$ and $\beta_2$, and Mardia's bivariate moments, $\beta_{1,2}$ and $\beta_{2,2}$, in the weighted SSE's, but these efforts were not fruitful due to the large variances of the higher order moments. Typically we observed agreement between the observed and expected moments and moment ratios but the expected observations were about ten percent less than the empirical.

We see in Table 4 a very nice fit, not only of total observations, but of the margins' means and variances and of the bivariate correlations. Of interest are the unexpected values of the estimated parameters for the 203 aircraft data set. We attempted to set $\alpha$ for margin $A_1(x)$ to zero but found the results to be extremely sensitive to this parameter. We can see the effects of $\alpha_p$ and $p$, as indicated in part 3 above, by the estimates for margin A2(4). The original method of moments parameters are $(a, b) = (2.01, 2.98)$.

8.  Summary.  We  have  presented  an  intuitively  appealing  system  of  multivariate 
distributions  which  has  the  Neyman  alternative  as  its  genesis.  We  have  considered  several 
univariate  and  bivariate  versions  and  a  combination  of  versions  to  demonstrate  the 
flexibility  of  construction.  Finally,  we  used  an  innovative  approach  which  captures  the 
moments  to  model  data  which  had  eluded  successful  fitting  for  over  two  decades. 


Further  versions  of  the  system  need  to  be  considered.  We  have  investigated  versions 

of  (1)  when  A(0j)  consisted  of  transcendental  functions,  but  little  progress  has  been  made. 
Higher  dimensional  versions  of  the  system  must  be  explored.  While  we  have  obtained  two 
trivariate  densities  through  arduous  differentiation,  little  is  known  about  them.  A  clearer 
fitting  technique  is  required,  such  as  maximum  likelihood.  Finally  a  test  of  fit  specific  to  the 
system  is  required.  The  distribution  of  the  statistic  is  not  known  at  this  time.  If  random 
variates  are  required,  we  can  simulate  the  system  through  the  simulation  of  the  margins 
used  in  its  construction.  With  these  advances,  we  will  have  a  new  and  complete 
multivariate  system  for  modeling  and  inference. 
TABLE 4. Descriptive Statistics and Estimates of Parameters for Bivariate Abort Data. (Margins: $A_1(x)$, $A_2(y)$, $A_3(x)$, $A_4(y)$.)
EVALUATION OF COMMUNICATIONS
THROUGHPUT


Virginia A.T. Kaste, Ann E.M. Brodeen, and Barbara D. Broome*
U.S. Army Ballistic Research Laboratory
Aberdeen Proving Ground, Maryland

Abstract 


A controlled laboratory experiment was conducted during the summer of 1991 to evaluate the Combat Net Radio Network (CNRN) performance of several emulated Advanced Field Artillery Tactical Data System (AFATDS) nodes communicating via existing Tactical Fire Direction System (TACFIRE) protocol using Single Channel Ground and Airborne Radio System (SINCGARS). The purpose of this experiment was to examine the effects of four levels of message length and four levels of message transmission rate on network throughput and delay. In addition, radio transmission mode was considered by running the test once using single channel transmissions and once using frequency hopping. Three replications of a 4 x 4 full factorial design were made for each test. Analysis of variance techniques as well as other forms of network analysis were utilized to examine the significance and measure the effects of the three network parameters. These analyses provide information on communications thresholds for the TACFIRE protocol using CNRNs. These thresholds should be considered when designing AFATDS communication architectures and protocols. This paper examines the results with respect to throughput.


1.  INTRODUCTION 


The  purpose  of  a  network  is  to  serve  as  a  carrier  of  information  from  one  point  to 
another.  To  measure  a  network’s  effectiveness,  one  must  determine  whether  the  messages 
the  network  services  arrive  at  their  destination  correctly  and  in  time  to  be  useful.  We  will 
refer  to  the  amount  of  correctly  passed  information  as  “throughput”  and  the  amount  of 
time  required  to  pass  that  information  as  “delay.”  There  are  a  number  of  parameters  that 
can  impact  throughput  and  delay,  for  example,  the  number  of  messages  to  transmit,  the  size 
of  those  messages,  the  number  of  nodes  on  the  network,  the  communications  protocol,  and 
the  communications  hardware. 

Simulation is a widely accepted means of examining the changes in network performance resulting from a change in hardware or communications protocol. Simulations,
however,  require  input.  They  take  information  like  the  probability  a  message  will  collide, 
the  expected  delay  in  message  transmission,  or  the  arrival  rate  of  messages  at  a  given  node, 


*The authors would like to acknowledge: Mark Thomas, US Army Human Engineering Laboratory, and Kenneth G. Smith, US Army Ballistic Research Laboratory (BRL), for developing and modifying the software drivers used in this test; Charles Hansen, BRL, for developing the scenarios; Holly Ingham, BRL, and Thomas DiGiacinto, US Army Test and Evaluation Command, for developing the Net Monitor program; and Paul Broome, BRL, for his expertise in improving queries to the database.
3. EXPERIMENTAL DESIGN

3.1 Factors

3.2 Design Matrix

combinations  wa^mtho**13*  shortest  reasonable  tim 

quired  a  minimum  of  16  hours^  **“  testi"8  of  3,1  16  test  comb  *"*  °n*  °f  the  16  test 
completed  in  one  day,  a  random- (  °i  3  ^  Nation)  llT T°nS  would  h^e  re 
day-to-day  variability  wortd  not  '"complete  block  design  Ji  reabstlcalJy  could  not  be 
"*d  into  blocks  of 1?Z Z^T * 1 he  re^s.  Se  l7£°*V*  “  °rd«  that 
assignment  of  the  test  rn_,v  •  '  and  ^e  four  blocks  wer*  «  6  test  combinations  were  di- 

tem™’  *”  WWch  a  different  se^ofthr10  bl°Cks  Was  based  °n  a^nfo^  41^  period’  The 

term  were  completely  confoimH  a  °?ree  of  ^e  nine  degrees  of  t*  n^0undmg  scheme.  This 
arrival  rate  and CaCh  «Phcarifn  sl^r  ? 
pon  throughput  could  be  ,iLT*5  and  interaction  of  d  ?at  ^  cffe«s  of  mes- 


Figure 4. The Server and Its Queues.

3.3 Design Limitations

Table 1. Mean Throughput (bits/sec) by Experimental Condition


Looking at the mean throughput for each level of message length in Table 1, one sees a significant increase in throughput as message length increased from 48 to 352 characters. Similarly, an increase in arrival rate increased the mean throughput. Comparing the different levels of message arrival rate for each level of

S' ss  -itTsrssrs  ™  '£££ 

were  generally  empty. 

When a factor is significant, techniques are available by which to partition its overall sum of squares with (p - 1) degrees of freedom into (p - 1) separate sums of squares, each with one degree of freedom. The different components can then be interpreted as


10  ‘MdBdc.  etC  i 

The  individual  de  '*  of  recession  of  respond  on  ihe 

There  were  •  .  0t^  channel  anH 

4.2 The Smirnov Nonparametric Test

Hie  Smim - 


The  Sm-  —  CSt 

and  it  is  desired  tn^*'  °ne  from  each  of  two6  tW°  SamP,es  have  been  w  °Ughput  data. 

- — s^^Z’saxsvSS^ 

asi,  „  ’Je  hypothesis  for  the  e„  *“ C“ons  a®*ia,ed 

w  ■ and  -  -  fizz 

Ho  •  pr  \  *  W  and 

ThC  a,ternative  hypothesis  was*  "  ^  f°ra,,Xfrom  to  +«,. 

dtea"o=  tew«„'^'d  K^tJVZ^ °"e  °f* 

-  WS' 7??,^*?“  »iC  2??  "  -  *—  verdca, 

''*»«<£% zz*  '-<*  a:a”obe reiKi““ 

*M  7/16.  Therefr,,  ?aWd  •»  «/H.  From  ,??P  “  °f  «S“ she.  The  £*“  ”  a  of 

^caoce  Cm ; ’ ^ pTdet  aTPHa'' 'ab,a- f°,r 
wroom.  *«■*.  IToV 

.  ,  '  *“*w  ‘'atisHcalfy 

Data  Sunuuajy 

In  this  section  the  following  defi  •  • 

th^0U^I>pu,  is  -red: 


j 


                                     DEGREES OF   SUM OF    MEAN     F
SOURCE                               FREEDOM      SQUARES   SQUARE   RATIO

Replications                              2         0.00     0.00    n.s.d.
Blocks within Replications                9         0.10     0.01    s.s.d.
Message Length (L)                        3        18.45     6.15
    Linear                                1        16.68    16.68    s.s.d.
    Quadratic                             1         1.71     1.71    s.s.d.
    Cubic                                 1         0.06     0.06    s.s.d.
Arrival Rate (R)                          3         2.32     0.77    s.s.d.
    Linear                                1         1.77     1.77    s.s.d.
    Quadratic                             1         0.46     0.46    s.s.d.
    Cubic                                 1         0.09     0.09    s.s.d.
Message Length x Arrival Rate (LR)        9         0.22     0.02    n.s.d.
Error                                    21         0.19     0.01
Total                                    47        21.28

Table 2. Analysis of Variance (Effect on Throughput - Single Channel)

n.s.d. - no significant difference
s.s.d. - statistically significant difference
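The linear, quadratic, and cubic rows of the table are single-degree-of-freedom components of a four-level factor. One standard way to obtain such a partition is with orthogonal polynomial contrasts, sketched below. The contrast coefficients assume four equally spaced levels, which may not match the spacing used in the experiment, and the level means are hypothetical placeholders rather than values from Table 1.

```python
import numpy as np

# Orthogonal polynomial contrasts for four equally spaced levels (an assumption).
CONTRASTS = {
    "linear": np.array([-3.0, -1.0, 1.0, 3.0]),
    "quadratic": np.array([1.0, -1.0, -1.0, 1.0]),
    "cubic": np.array([-1.0, 3.0, -3.0, 1.0]),
}

def factor_ss(level_means, n_per_level):
    """Sum of squares for the factor, with (levels - 1) degrees of freedom."""
    grand = level_means.mean()
    return n_per_level * ((level_means - grand) ** 2).sum()

def contrast_ss(level_means, n_per_level):
    """One-degree-of-freedom sum of squares for each polynomial component."""
    return {name: n_per_level * (c @ level_means) ** 2 / (c @ c)
            for name, c in CONTRASTS.items()}

# Hypothetical level means; 12 observations per level in a balanced 4 x 4 x 3 layout.
means = np.array([1.0, 2.5, 3.1, 3.4])
parts = contrast_ss(means, 12)
print(parts, sum(parts.values()), factor_ss(means, 12))
```

Because the three contrasts are mutually orthogonal and orthogonal to the constant, their sums of squares add up to the factor sum of squares in a balanced design, which is the additivity visible in the table (for example, 16.68 + 1.71 + 0.06 = 18.45).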


                                     DEGREES OF   SUM OF    MEAN     F
SOURCE                               FREEDOM      SQUARES   SQUARE   RATIO

Replications                              2         0.01     0.00    n.s.d.
Blocks within Replications                9         0.11     0.01    s.s.d.
Message Length (L)                        3        17.41     5.80
    Linear                                1        15.72    15.72    s.s.d.
    Quadratic                             1         1.62     1.62    s.s.d.
    Cubic                                 1         0.07     0.07    s.s.d.
Arrival Rate (R)                          3         1.54     0.51    s.s.d.
    Linear                                1         1.20     1.20    s.s.d.
    Quadratic                             1         0.27     0.27    s.s.d.
    Cubic                                 1         0.07     0.07    s.s.d.
Message Length x Arrival Rate (LR)        9         0.21     0.02    n.s.d.
Error                                    21         0.18     0.01
Total                                    47        19.46

Table 3. Analysis of Variance (Effect on Throughput - Frequency Hopping)
Figure 5. Normalized Throughput vs Message Length
Grouped by Arrival Rate (Single Channel)
bps. If the Hamming code is considered overhead, the throughput dropped to 360 bps for
SC  and  325  bps  for  FH.  The  limits  of  the  CNRN  using  TACFIRE  protocol  identified  in 
this  experiment  should  be  considered  when  modeling  or  designing  communications 
architectures  and  protocols. 
Abstract: We consider the problem of identifying the class of time series model to which a series belongs based on observation of part of the series. Techniques of nonparametric estimation have been applied to this problem by various authors using kernel estimates of the one-step lagged conditional mean and variance functions. We study cumulative versions of Tukey regressogram estimators of such functions. These are more stable than estimates of the mean and variance functions themselves and can be used to construct confidence bands. Goodness-of-fit tests for specific parametric models are also briefly discussed.

1  Introduction 


Currently one of the most challenging problems in nonlinear time series analysis is to identify the class of time series model to which a series $\{X_t\}$ belongs based on observation of part of the series, $\{X_t,\ t = 0, 1, \ldots, n\}$. Techniques of nonparametric estimation have been applied to this problem by Robinson (1983), who studied the large sample properties of kernel estimators of lagged conditional means $E(X_t \mid X_{t-j})$ and $E(X_t \mid X_{t-j}, X_{t-k})$ for various $j$ and $k$ values. Such estimators are useful for detecting nonlinearities graphically, see Tong (1990, p. 12). This approach has been further developed by Auestad and Tjøstheim (1990), who focused on kernel estimates of the one-step lagged conditional mean and variance functions $\lambda(x) = E(X_t \mid X_{t-1} = x)$ and $\gamma(x) = \mathrm{var}(X_t \mid X_{t-1} = x)$ for the purpose of identifying common nonlinear models such as threshold (Tong, 1983) and exponential autoregressive (Ozaki, 1980).

In the present paper we discuss an approach to this problem based on estimation of cumulative versions of the conditional mean and variance functions, $\Lambda(\cdot) = \int_a^{\cdot} \lambda(x)\,dx$ and $\Gamma(\cdot) = \int_a^{\cdot} \gamma(x)\,dx$, where $a$ is an appropriately chosen point in the state space. The estimators, denoted $\hat\Lambda$ and $\hat\Gamma$, are obtained by integrating Tukey regressograms for $\lambda$ and $\gamma$. The reason for considering cumulative versions of the conditional mean and variance is that it is possible to derive functional limit theorems, whereas available asymptotic results for kernel or regressogram estimators of $\lambda$ and $\gamma$ are only useful pointwise. We advocate $\hat\Lambda$ and $\hat\Gamma$ as natural 'signatures' of a time series in preference to estimates of $\lambda$ and $\gamma$.

We present a functional limit theorem for $\hat\Lambda$ which holds under conditions that can be readily checked when $\{X_t\}$ is a Markov chain. This result can be used to


1  Research  supported  by  Army  Research  Office  Grant  DAA03-90-G0103. 

2  Research  supported  by  the  Air  Force  Office  of  Scientific  Research  under  Grant  AFOSR91-0048. 
construct confidence bands, which are more helpful than confidence intervals in assessing plots. This is the chief benefit from estimating cumulative conditional means and variances rather than $\lambda$ and $\gamma$ themselves. Another benefit is that $\hat\Lambda$ and $\hat\Gamma$ are relatively insensitive to variations in bandwidth compared to the kernel or regressogram estimators.

We also briefly describe an application of our approach to omnibus goodness-of-fit tests for parametric models of the form $\lambda = g(\theta, \cdot)$, where $g$ is a known function and $\theta$ is an unknown parameter, e.g. a linear model. Robinson (1983) has given a test for linearity, but his test has the disadvantage that it is only applicable at a small number of discrete locations. Other formal tests for linearity found in the literature are parametric, constructed by arranging the linear model to be nested within various larger parametric models, see Tong (1990, Section 5.2). We propose an omnibus test based on a comparison of $\hat\Lambda$ and a smoothed version of $\int g(\hat\theta, x)\,dx$, where $\hat\theta$ is the conditional least squares estimator of $\theta$, see Klimko and Nelson (1978).

There are some connections between the present paper and cumulative hazard function estimation, see the survey articles of Andersen and Borgan (1985) and McKeague and Utikal (1990a). In fact $\hat\Lambda$ is closely related to an estimator introduced by McKeague and Utikal (1990b). Martingale techniques play an important role here, as they do in survival analysis.

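2 Estimation of $\Lambda$ and $\Gamma$

Assume that the conditional mean and variance of $X_t$ given $X_{t-1}, X_{t-2}, \ldots$ only depend on $X_{t-1}$. This property holds, for example, if $\{X_t\}$ is a Markov chain. In particular, an important example is the nonlinear autoregressive process

$$X_t = \lambda(X_{t-1}) + \sigma(X_{t-1})\,\epsilon_t, \qquad (1.1)$$

where $\{\epsilon_t\}$ are iid with zero mean and unit variance and $\gamma = \sigma^2$. In this case the time series is characterised by the triplet ($\lambda$, $\gamma$, distribution of $\epsilon_0$). We are primarily interested in $\lambda$ and $\gamma$. It is assumed throughout that $\{X_t\}$ is stationary with a marginal density denoted $f$.

We restrict attention to estimation of $\Lambda$ and $\Gamma$ on a fixed interval $[a, b]$. The regressogram estimators $\hat\lambda$ and $\hat\gamma$ are defined as follows. Let $\mathcal{I}_1, \ldots, \mathcal{I}_{d_n}$ be a partition of $[a, b]$ into intervals of equal length $w_n$, and let $\mathcal{I}_x$ denote the interval containing $x$. Then

$$\hat\lambda(x) = (n w_n \hat f(x))^{-1} \sum_{t=1}^{n} I\{X_{t-1} \in \mathcal{I}_x\}\, X_t,$$

$$\hat\gamma(x) = (n w_n \hat f(x))^{-1} \sum_{t=1}^{n} I\{X_{t-1} \in \mathcal{I}_x\}\, (X_t - \hat\lambda(x))^2,$$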
f=i 
where $\hat f$ is the histogram estimator of $f$ given by

$$\hat f(x) = (n w_n)^{-1} \sum_{t=1}^{n} I\{X_{t-1} \in \mathcal{I}_x\},$$

and $I(\cdot)$ is the indicator function. Regressogram estimators were introduced by Tukey (1961) and have been studied recently by Diebolt (1990).

Introduce the estimators

$$\hat\Lambda(\cdot) = \int_a^{\cdot} \hat\lambda(x)\,dx \quad \text{and} \quad \hat\Gamma(\cdot) = \int_a^{\cdot} \hat\gamma(x)\,dx.$$

Although it is possible to use the more sophisticated kernel estimators to yield better
estimates of $\lambda$ and $\gamma$, there is little to be gained from using them in $\hat\Lambda$ and $\hat\Gamma$, which
are less sensitive to variations in $\hat\lambda$ and $\hat\gamma$. We prefer the regressogram estimators due
to their computational simplicity. In practice, care needs to be taken in choosing
the interval $[a, b]$ and the bins to ensure that the regressogram estimates are not too
unstable. For good results, the binwidths should be of comparable size (we have
taken them to be of equal size merely to simplify the notation), and there should
be at least 5 observations per bin.
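
The estimators above amount to bin-wise means and variances of $X_t$ grouped by the bin
containing $X_{t-1}$. The following is a minimal sketch of that computation, not the authors'
code; the function names `regressogram` and `cumulative`, the argument `d` (number of bins)
and the treatment of empty bins are illustrative assumptions.

```python
def regressogram(x, a, b, d):
    """Regressogram estimates of f, lambda and gamma on [a, b).

    x holds the observed series X_0, ..., X_n; d is the number of
    equal-width bins (an illustrative argument name).
    """
    n = len(x) - 1                       # number of (X_{t-1}, X_t) pairs
    w = (b - a) / d                      # common bin width w_n
    successors = [[] for _ in range(d)]  # X_t grouped by the bin of X_{t-1}
    for prev, cur in zip(x[:-1], x[1:]):
        if a <= prev < b:
            r = min(int((prev - a) / w), d - 1)
            successors[r].append(cur)
    f_hat, lam_hat, gam_hat = [], [], []
    for vals in successors:
        f_hat.append(len(vals) / (n * w))
        if vals:
            mean = sum(vals) / len(vals)
            lam_hat.append(mean)
            gam_hat.append(sum((v - mean) ** 2 for v in vals) / len(vals))
        else:
            # Empty bin: the estimates are undefined; None is our convention.
            lam_hat.append(None)
            gam_hat.append(None)
    return f_hat, lam_hat, gam_hat, w


def cumulative(values, w):
    """Integrate a piecewise-constant estimate; values at the right bin edges."""
    total, out = 0.0, []
    for v in values:
        total += (v or 0.0) * w          # empty bins contribute nothing (assumption)
        out.append(total)
    return out
```

Calling `cumulative(lam_hat, w)` gives the cumulative estimate at the right-hand bin edges;
bins with fewer than about 5 observations should be treated with caution, as noted above.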

Ideally, in order to carry out inference on $\Lambda$, using a confidence band for $\Lambda$ say,
we would like to find the limiting distribution of $\sqrt{n}(\hat\Lambda - \Lambda)$. However, for technical
reasons we are only able to obtain a satisfactory weak convergence theory when $\Lambda$
is replaced by the smoothed version of $\Lambda$ given by $\Lambda^*(z) = \int_a^z \lambda^*(x)\,dx$, where

$$\lambda^*(x) = \int_{I_x} f^*(u)\,\lambda(u)\,du \Big/ \int_{I_x} f^*(u)\,du$$

and $f^*$ is the histogram estimator of $f$ determined by a finer partition of $[a, b]$
consisting of intervals of equal length $w_n^*$. We regard $\Lambda^*$ as a 'surrogate' for $\Lambda$,
which is reasonable since $\Lambda^*$ converges uniformly in probability to $\Lambda$. However,
$\sqrt{n}(\Lambda^* - \Lambda)$ may not be asymptotically negligible. If it is (for example if $\lambda$ is
piecewise constant over $I_1, \ldots, I_{d_n}$ for some $n$) then $\Lambda^*$ is not needed and we can
deal with $\Lambda$ directly.

The asymptotic distribution of $\hat\Lambda$ is given by the following result, for which we
assume that $\lambda$ is Lipschitz, $E X_0^4 < \infty$, $(X_0, X_t)$ has a bounded joint density for all
$t \ge 1$, and the marginal density $f$ is continuous and does not vanish on $[a, b]$.

THEOREM. Suppose that $\sup_{x \in [a,b]} \mathrm{var}[\hat f(x)] = o(w_n)$, $n w_n \to \infty$, $n w_n^4 \to 0$ and
$w_n^* \sim w_n^2$ as $n \to \infty$. Then $\sqrt{n}(\hat\Lambda - \Lambda^*)$ converges in distribution to a continuous
Gaussian martingale with mean zero and variance function $H(z) = \int_a^z \gamma(x)/f(x)\,dx$.

A  proof  of  this  result  can  be  found  in  McKeague  and  Zhang  (1991).  A  large 
class  of  stationary  Markov  processes  {Xt}  that  satisfy  the  first  condition  of  the 
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provided  that  /  is  bounded  there  Thus  the  rn  jv  ^  uniformly  over  [a,  b] 

case  if  ««*  ^  oo.  la  a  particular  ^  tWm  holds  «  thta 

ergodicity  (Nummelin,  1984),  which  implies  str  ™  —  easier  to  check  geometric 
rate.  Geometric  ergodicity  s  in  turn  mmng  With  a  Metric  mixing 

Tweedie  (1983).  *  “  lmphed  *  a  readily  checkable  condition  of 

We now turn to some possible applications of this result.

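Confidence bands. It can be shown that $\hat H(z) = \int_a^z \hat\gamma(x)/\hat f(x)\,dx$ is a uniformly
consistent estimator of $H$, so that an asymptotic $100(1-\alpha)\%$ confidence band for $\Lambda^*$ is
given by

$$\hat\Lambda(z) \pm c_\alpha\, n^{-1/2}\, \hat H(b)^{1/2} \left(1 + \frac{\hat H(z)}{\hat H(b)}\right), \qquad z \in [a, b],$$

where $c_\alpha$ is the upper $\alpha$ quantile of the distribution of $\sup_{t \in [0, 1/2]} |B^0(t)|$, and $B^0$ is
the Brownian bridge process (see Andersen and Borgan, 1985, p. 114). Tables for
$c_\alpha$ can be found in Hall and Wellner (1980).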

Comparing regression functions. Consider the two-sample problem of testing whether
two independent time series have identical regression functions $\lambda$. Denote the various
functions, sample sizes, estimators etc. associated with the two series by using a
subscript $i$, $i = 1, 2$. Let $n = n_1 + n_2$. Then, if $n_i/n \to p_i > 0$ and the conditions of the
theorem are satisfied for the two series, $\sqrt{n}(\hat\Lambda_1 - \hat\Lambda_2)$ converges in distribution to a
continuous Gaussian martingale with mean zero and variance function

$$\int_a^z \left( \frac{\gamma_1(x)}{p_1 f_1(x)} + \frac{\gamma_2(x)}{p_2 f_2(x)} \right) dx,$$

provided that $\lambda_1 = \lambda_2$ on $[a, b]$ and $\sqrt{n}(\Lambda_1^* - \Lambda_2^*)$ converges in probability
to zero. The latter condition holds if the common regression function is piecewise
constant, as mentioned earlier. Confidence bands for $\Lambda_1^* - \Lambda_2^*$ can be constructed as
above. Some plots of such bands are given in Section 3.

Goodness-of-fit tests. Consider the problem of testing whether $\lambda$ belongs to
a parametric family $\{g(\theta, \cdot), \theta \in \Theta\}$. Here $g$ is a known deterministic function
and $\Theta$ is a closed, bounded subset of $R^p$. Let $\tilde\Lambda(z) = \int_a^z \tilde\lambda(x)\,dx$, where

$$\tilde\lambda(x) = \int_{I_x} f^*(u)\, g(\hat\theta, u)\,du \Big/ \int_{I_x} f^*(u)\,du$$

and $\hat\theta$ is the conditional least squares estimator minimizing $\sum_{t=1}^{n} (X_t - g(\theta, X_{t-1}))^2$.
McKeague and Zhang (1991) have shown that, under the parametric model, the
process $\sqrt{n}(\hat\Lambda - \tilde\Lambda)$ converges weakly to a Gaussian process having a covariance
that can be estimated consistently. This result can be used to develop graphical
methods for detecting departures from the parametric model based on plots of $\hat\Lambda - \tilde\Lambda$,
or to give formal chi-squared goodness-of-fit tests.


3  Simulation  study  and  example 

We have carried out simulations using three model examples taken from Auestad
and Tjøstheim (1990):

Model 1: linear autoregressive, $X_t = 0.8 X_{t-1} + \epsilon_t$;

Model 2: threshold autoregressive,

$$X_t = \begin{cases} -0.3 X_{t-1} + \epsilon_t, & \text{if } X_{t-1} < 0, \\ 0.8 X_{t-1} + \epsilon_t, & \text{if } X_{t-1} \ge 0; \end{cases}$$

Model 3: exponential autoregressive, $X_t = \{0.8 - 1.1 \exp(-50 X_{t-1}^2)\} X_{t-1} + \epsilon_t$.

Here $\epsilon_t$ is Gaussian white noise with mean zero and standard deviation 0.1.
Auestad and Tjøstheim (1990) checked geometric ergodicity and stationarity for
these examples.

We restricted estimation of $\Lambda$ to the interval $[-0.3, 0.3]$. The binwidth was
taken as $w_n = 0.05$ (the same as Auestad and Tjøstheim, who plotted point estimates
of $\lambda$ for these three models). Inspecting the plots of $\hat\Lambda$ in Figure 1, we find that
the three models are easily distinguishable, even for sample sizes as low as 250. The
parabolic shape for the linear autoregressive model, and the 'squashed' parabola for
the exponential autoregressive model, are especially distinct.
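
For readers who wish to reproduce series of this kind, the three models can be simulated
as in the sketch below. It is an illustration under stated assumptions rather than the code
used for the study: the function names, the seed handling and the burn-in length of 200
steps are our own choices. The last function gives the true cumulative regression function
for Model 1, $\Lambda(z) = \int_{-0.3}^{z} 0.8x\,dx = 0.4(z^2 - 0.09)$, which is the parabola referred to above.

```python
import math
import random


def regression_function(model, x):
    """True regression function lambda(x) for Models 1 to 3."""
    if model == 1:
        return 0.8 * x
    if model == 2:
        return -0.3 * x if x < 0 else 0.8 * x
    if model == 3:
        return (0.8 - 1.1 * math.exp(-50.0 * x * x)) * x
    raise ValueError("model must be 1, 2 or 3")


def simulate(model, n, seed=0, burn_in=200, sd=0.1):
    """Simulate n observations; burn_in steps are discarded (an assumption)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = regression_function(model, x) + rng.gauss(0.0, sd)
        if t >= burn_in:
            out.append(x)
    return out


def cumulative_linear(z, a=-0.3):
    """Lambda(z) for Model 1: integral of 0.8 x from a to z."""
    return 0.4 * (z * z - a * a)
```

For example, `simulate(3, 250)` produces a series of the smallest sample size mentioned
above for the exponential autoregressive model.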

Figure 2 shows plots of differences between the estimates of the cumulative
regression functions in the two-sample problem, for various pairs of the above models.
In the first plot in each row, the two series are generated using the linear model and
the zero function is contained within the band, so our test would correctly conclude
that the regression functions are identical. In the other plots, the zero function is
well outside the bands and the test correctly concludes that the regression functions
are different.

We conclude with an example involving real data. Consider the set of IBM
daily closing stock prices from late 1959 to mid 1960 (period I) and mid 1961 to
early 1962 (period II) given in Tong (1990). The daily relative change in price
appears to be stationary and is used in place of the raw data. Tong (1990) tested
for linearity and decided that period I is linear and period II is nonlinear. Figure
3 gives a plot of the difference between the estimates of the cumulative regression
functions in the two periods, along with the 95% confidence band, using $d_n = 10$.
The confidence band does not contain the zero function, so we conclude that
the regression functions for the two periods differ significantly from one another.
Our chi-squared test with $d_n = 8$, 10 and 12, and degrees of freedom $L = 2$ and 4,
gave the same result.


Figure 3. $\hat\Lambda_1 - \hat\Lambda_2$ with 95% confidence band for IBM stock price data; $d_n = 10$;
$\hat\Lambda_1$ = period I, $\hat\Lambda_2$ = period II.

Abstract

Change Analysis “in the strict sense” is concerned with the problem of detecting
and estimating slow and abrupt changes in the probability distributions of
successive observations Y(t) of a variable or system. This paper has two goals: (1)
introduce an approach to change problems based on the analysis of Score Change
Processes (whose idea is to study whether a model fitted to a whole data set fails
to fit it by “random walking” the parameter estimating equations); (2) develop
analogies between four basic statistics problems, corresponding to the standard
assumptions made about a sequence of observations Y(t), t = 1, ..., n. For such a
sequence one tests the hypotheses A: Distribution of specified parametric form,
B: Independence, C: Identical distribution. For a sequence of bivariate observations
(X(t), Y(t)) one would like to test D: Independence of X and Y. Contents are:
Introduction, Change analysis in the strict sense (test Assumption C), Goodness of
fit (test Assumption A), Spectral Analysis (test Assumption B), Four phases of
change analysis, Parametric scores change analysis, Nonparametric scores change
analysis.

1.  Introduction 

Data Y(1), ..., Y(n) which can be regarded as continuous random variables
observed sequentially can be called indexed data or a time series. Classic statistical
inference makes three basic assumptions:

Assumption A. Probability law of each Y has probability density belonging to
a known parametric family of probability densities $f(y; \theta)$.

Assumption B. Random variables Y(1), ..., Y(n) are independent.

Assumption C. Random variables Y(1), ..., Y(n) are identically distributed.

Methods for detecting (and estimating) the fit (and the nature of violations)
of these assumptions in our opinion can be respectively related to three parallel
theories:

Theory A. Goodness of fit.

Theory B. Spectral analysis (time series analysis in the frequency domain).

Theory C. Changepoint analysis or change analysis (in the strict sense).

We believe that one can define a theory, called Comparison Change Analysis,
which is intended to study analogies between theories A, B, C (and bring the insights
of the theories that are more developed, such as spectral analysis, to less developed
ones). General accounts of this theory are given in Parzen (1992), (1991).

The assumption that the data is observed sequentially, which may seem to limit
the applicability of Change Analysis, is dropped when the analogies are extended
to the bivariate data analysis problem which considers independent bivariate data
(X(t), Y(t)), t = 1, ..., n, and desires to model the relation between X and Y and
in particular to test

Assumption D. X and Y are independent random variables.

A general non-parametric theory of testing Assumption D can be related to

Theory D. Change analysis (random effect).

The analogies between theories A to D are obtained from the fact that in each
problem the first step in the analysis is to define a dynamic statistic which is a function
on the interval [0, 1] whose asymptotic distribution (under the null hypothesis
that the assumptions are true) is either a Brownian Bridge or a related process. The
test statistics in each theory are analogous to the nonparametric test statistics that
statisticians have developed to test goodness of fit or equality of two distributions.

An important problem is how to choose among the many test statistics for
goodness of fit and analogous testing problems; we believe we should be optimistic
that it is possible to develop procedures for adaptively choosing appropriate
test statistics which not only test the null hypothesis but also suggest likely models
instead of only rejecting the null hypothesis.

2.  Change  analysis  in  the  strict  sense  (test  Assumption  C) 

The theory of change analysis in the strict sense considers data Y(t), t = 1, ..., n.
Let $Y^\sim$ denote the sample mean (the estimator of the true mean $\mu$ if the data
are identically distributed). Let $\sigma_Y^\sim$ denote a suitable estimator (such as the
sample standard deviation) of the true standard deviation $\sigma_Y$ of the data under the
assumption of identical distribution (which is assumed to be finite).

The data Y is transformed to normalized data

$$Y^\sim(t) = (Y(t) - Y^\sim)/\sigma_Y^\sim.$$

We plot the normalized data as a sample change density $c^\sim(\tau)$, $0 < \tau < 1$, defined
to be a piecewise constant function whose value is equal to $Y^\sim(j)$ on the interval
$(j - 1)/n < \tau < j/n$, for $j = 1, \ldots, n$. Note that $\int_0^1 c^\sim(\tau)\,d\tau = 0$, $\int_0^1 |c^\sim(\tau)|^2\,d\tau = 1$.

Cusum (cumulative sum) charts are becoming increasingly important diagnostic
tools. They are related to the sample change process on $0 \le \tau \le 1$,

$$C^\sim(\tau) = \int_0^\tau c^\sim(t)\,dt.$$

Points $\tau = j/n$, $j = 1, \ldots, n - 1$, are called “exact” values of $\tau$; at these points
$C^\sim(\tau)$ equals a cumulative sum:

$$C^\sim(j/n) = (1/n) \sum_{t=1}^{j} Y^\sim(t).$$
To see why the change process is an effective means of detecting change
in the data, consider its behavior under two models for Y(.).

If Y(.) is deterministic and linear, say Y(t) = t, then at exact $\tau = j/n$ approximately

$$c^\sim(\tau) = Y^\sim(j) = 12^{.5}(\tau - .5),$$

$$C^\sim(\tau) = (-.5)\,12^{.5}\,\tau(1 - \tau).$$

The graph of $C^\sim(\tau)$ when Y(.) is linear is a parabola that goes from 0 to 0 with
minimum value at $\tau = .5$.

If Y(.) is random (independent identically distributed), the stochastic process
$n^{.5} C^\sim(\tau)$, $0 \le \tau \le 1$, can be shown to be asymptotically distributed (as the sample size
n tends to infinity) as a Brownian Bridge stochastic process $B(\tau)$, $0 \le \tau \le 1$, which
is a zero mean Gaussian process with covariance kernel $E[B(s)B(t)] = \min(s, t) - st$.
Note that B(0) = B(1) = 0, and

$$\mathrm{Variance}[B(\tau)] = \tau(1 - \tau).$$

To test for departures from Assumption C (identical distribution) one tests if the
observed change process $C^\sim(\tau)$ is significantly different from a sample curve of a
Brownian Bridge, which can be expected to be a wiggly (non-smooth) curve
oscillating about the horizontal axis.
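
As a minimal sketch of the construction just described (the function name `change_process`
is an illustrative assumption, and the standard deviation uses divisor n so that the change
density has unit mean square), the normalized data and the change process at the exact
points can be computed as follows. For the linear case Y(t) = t the values follow the
parabola $(-.5)\,12^{.5}\,\tau(1 - \tau)$ up to a factor that tends to 1.

```python
import math


def change_process(y):
    """Return (c, C): c[j-1] is the normalized value Y~(j), and
    C[j] is the change process at the exact point tau = j/n, j = 0..n."""
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)  # divisor n (assumption)
    c = [(v - mean) / sd for v in y]
    C, total = [0.0], 0.0
    for v in c:
        total += v
        C.append(total / n)              # cumulative sum divided by n
    return c, C


# Linear example: the minimum sits at tau = .5 and is close to
# -.5 * sqrt(12) * .25.
c_lin, C_lin = change_process([float(t) for t in range(1, 1001)])
```

Plotting `C_lin` against j/n shows the parabola; replacing the input by independent
identically distributed values gives instead an irregular path oscillating about zero.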

A related process that plays a central role in change analysis is the Change
Test Process

$$CT^\sim(\tau) = C^\sim(\tau)/(\tau(1 - \tau))^{.5}.$$

The fundamental role of the change test process starts with the fact that for
fixed $\tau = j/n$, $CT^\sim(\tau)$ can be shown to be a monotone transformation of the
classic two-sample Student's t-test statistic of the null hypothesis $\mu_1 = \mu_2$ in the
model: Y(1), ..., Y(j) is Normal($\mu_1, \sigma^2$) and Y(j + 1), ..., Y(n) is Normal($\mu_2, \sigma^2$).
The sample means and variances of the two samples Y(1), ..., Y(j) and
Y(j + 1), ..., Y(n) are respectively denoted $\mu_1^\sim, S_1^{\sim 2}$ and $\mu_2^\sim, S_2^{\sim 2}$. The pooled sample
variance is

$$S^{\sim 2} = \tau S_1^{\sim 2} + (1 - \tau) S_2^{\sim 2}.$$

One can verify that

$$\sigma_Y^{\sim 2} = S^{\sim 2} + \tau(1 - \tau)(\mu_1^\sim - \mu_2^\sim)^2,$$

$$\mu_1^\sim - Y^\sim = (1 - \tau)(\mu_1^\sim - \mu_2^\sim).$$

The classic two-sample Student's t-test statistic is $n^{.5} T$, defining

$$T = (\tau(1 - \tau))^{.5}(\mu_1^\sim - \mu_2^\sim)/S^\sim.$$

Define R, a “correlation version” of T, by

$$R^2 = T^2/(1 + T^2), \qquad T^2 = R^2/(1 - R^2).$$

Then

$$R^2 = \tau(1 - \tau)(\mu_1^\sim - \mu_2^\sim)^2/\sigma_Y^{\sim 2}$$

and one concludes that $CT^\sim(\tau)$ is, like R, a correlation type statistic since

$$R^2 = (\tau/(1 - \tau))(\mu_1^\sim - Y^\sim)^2/\sigma_Y^{\sim 2} = |CT^\sim(\tau)|^2.$$

We can consequently express Student's t-test statistic T as a monotone function of
$CT^\sim(\tau)$ since $T = R/(1 - R^2)^{.5}$.
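
The identities above can be checked numerically. The sketch below is an illustration only;
the function name `two_sample_identities` is an assumption, and all variances use the
divisor equal to the sample size, which is what the decomposition of $\sigma_Y^{\sim 2}$ requires.
It returns T, $R^2$ and $CT^\sim(\tau)$ for a split after observation j.

```python
import math


def two_sample_identities(y, j):
    """Return (T, R2, CT) for the split Y(1..j) versus Y(j+1..n)."""
    n = len(y)
    tau = j / n
    first, second = y[:j], y[j:]
    m1 = sum(first) / j
    m2 = sum(second) / (n - j)
    s1 = sum((v - m1) ** 2 for v in first) / j          # divisor j
    s2 = sum((v - m2) ** 2 for v in second) / (n - j)   # divisor n - j
    pooled = tau * s1 + (1.0 - tau) * s2
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n           # divisor n
    T = math.sqrt(tau * (1.0 - tau)) * (m1 - m2) / math.sqrt(pooled)
    R2 = T * T / (1.0 + T * T)
    C = sum(v - mean for v in first) / (n * math.sqrt(var))
    CT = C / math.sqrt(tau * (1.0 - tau))
    return T, R2, CT


T, R2, CT = two_sample_identities([1.0, 2.0, 4.0, 3.0, 7.0, 9.0, 8.0, 6.0], 4)
# R2 agrees with CT**2, and CT / sqrt(1 - CT**2) recovers T.
```

In this small example the first half has the smaller mean, so both T and $CT^\sim(\tau)$ are negative.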

Let $\tau^\sim$ denote the value among the exact values $\tau = j/n$ (for $j = 1, \ldots, n - 1$) at
which the absolute value of $CT^\sim(\tau)$ achieves its maximum. Under the assumption of
at most one change in the distribution of Y(.), $CT^\sim(\tau^\sim)$ is a test statistic for change
and its time of occurrence is consistently estimated by $\tau^\sim$ (a result established by
Carlstein (1988)).
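
A minimal sketch of this estimator follows; the function names are illustrative assumptions
and the standard deviation again uses divisor n. The change test process is evaluated at
every exact point and the point of largest absolute value is returned.

```python
import math


def change_test_process(y):
    """Return a list of (tau, CT) pairs at the exact points tau = j/n."""
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    out, partial = [], 0.0
    for j in range(1, n):
        partial += (y[j - 1] - mean) / sd
        tau = j / n
        out.append((tau, (partial / n) / math.sqrt(tau * (1.0 - tau))))
    return out


def estimate_changepoint(y):
    """Return (tau, CT) at which |CT| is largest (first one if tied)."""
    return max(change_test_process(y), key=lambda pair: abs(pair[1]))


# Artificial data with a level shift after the sixth observation.
shifted = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0]
tau_hat, ct_hat = estimate_changepoint(shifted)
```

For this artificial series the maximum is attained at the true split, $\tau = .5$. Whether the
value of the statistic is significant must still be judged against its null distribution.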

3.  Goodness  of  fit  (Test  Assumption  A) 

One of the most extensive and least applied branches of statistical theory is
the theory of goodness of fit of probability models to observed data. Despite its
importance (both for theory and practice) it appears to be sparsely taught to
graduate students in statistics. The chi-squared goodness of fit test introduced by Karl
Pearson in 1900 is regarded as one of the top 20 achievements in modern science.
How can one explain the neglect of instruction in its theory? One explanation may
be that its theory is often taught rigorously as a study in pure probability theory
rather than developed vigorously for its statistical interpretation.

Let Y(t), t = 1, ..., n, be a random sample of a continuous random variable with
true distribution $F(y) = F(y; \theta_0)$ belonging to a finite parametric family $F(y; \theta)$.
The true quantile function is $F^{-1}(u; \theta_0)$, $0 < u < 1$. The sample distribution
function is denoted

$F^\sim(y)$ = fraction of sample $\le y$.

Let $\theta^\sim$ denote the maximum likelihood estimator of $\theta$. Stochastic processes whose
asymptotic properties are of interest (for both theory and practice) are

$$F^\sim(y) - F(y; \theta_0),$$

$$F^\sim(y) - F(y; \theta^\sim),$$

$$F(y; \theta^\sim) - F(y; \theta_0),$$

evaluated at $y = F^{-1}(u; \theta_0)$, $0 < u < 1$. We denote such a process $C^\sim(u)$, $0 < u < 1$,
to emphasize its analogy to a sample change process. We use functions of u to study
changes of distribution, and functions of $\tau$ to study changes of models fitting data.

The  testing  and  estimation  procedures  of  goodness  of  fit  theory  can  be  organized 
into  four  phases  summarized  (in  section  5)  in  our  discussion  of  the  four  phases  of 
change  analysis. 

4.  Spectral  Analysis  (Test  Assumption  B) 

One approach to testing the assumption of independence is to consider as an
alternative hypothesis for the data Y(t), t = 1, ..., n, that it is a zero mean stationary
time series with covariance function, defined for $v = 0, \pm 1, \pm 2, \ldots$,

$$R(v) = E[Y(t)Y(t - v)]$$

and spectral density function, defined for $0 \le \omega \le 1$,

$$f(\omega) = \sum_{v=-\infty}^{\infty} R(v) \exp(-2\pi i \omega v).$$

The sample spectral density is defined

$$f^\sim(\omega) = (2\pi n)^{-1} \Big| \sum_{t=1}^{n} Y(t) \exp(2\pi i \omega t) \Big|^2$$

with sample distribution function (on $0 \le \omega \le 1$)

$$F^\sim(\omega) = \int_0^\omega f^\sim(\lambda)\,d\lambda.$$

Normalized versions of these functions are

$$f^{*\sim}(\omega) = f^\sim(\omega)/F^\sim(1),$$

$$F^{*\sim}(\omega) = F^\sim(\omega)/F^\sim(1).$$

Analogues of the sample change density and sample change process are

$$c^\sim(\omega) = f^{*\sim}(\omega) - 1,$$

$$C^\sim(\omega) = F^{*\sim}(\omega) - \omega.$$

5.  Four  phases  of  change  analysis 

A sample change process $C^\sim(\tau)$, $0 \le \tau \le 1$, is a dynamic statistic (sample
path of a stochastic process) which often can be shown to satisfy, under the null
hypothesis of “no change”, the null hypothesis $H_0$: $C^\sim(.)$ is a Brownian Bridge (or
a related hypothesis). The statistical analysis of $C^\sim(.)$ has four phases:

Phase 1: Graphical analysis. Is the plot of $C^\sim(\tau)$, $0 \le \tau \le 1$, oscillatory, a
deterministic parabola, or some other pattern?

Phase 2: Non-linear functionals. One tests $H_0$ by computing the values of test
statistics (whose asymptotic distributions under $H_0$ can be deduced from the
theory of empirical processes)

$$\int_0^1 |C^\sim(\tau)|^2\,d\tau,$$

$$\int_0^1 \big(|C^\sim(\tau)|^2/(\tau(1 - \tau))\big)\,d\tau,$$

$$\max_{0 < \tau < 1} |C^\sim(\tau)|,$$

$$\max_{\tau = j/n} |C^\sim(\tau)|/(\tau(1 - \tau))^{.5}.$$

Phase 3: Linear functionals. For various score functions $K(\tau)$, called change
score functions, one computes the linear functional (or component)

$$C^\sim(K) = \int_0^1 K(\tau)\,dC^\sim(\tau) = \int_0^1 K(\tau)\,c^\sim(\tau)\,d\tau.$$

One can often write approximately

$$C^\sim(K) = (1/n) \sum_{t=1}^{n} K((t - .5)/n)\, c^\sim((t - .5)/n).$$

The score function is usually chosen as a sequence of orthonormal functions
$K_1, K_2, \ldots$, especially the Legendre polynomials, which test against patterns in
the change density $c^\sim(\tau)$.
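
Phases 2 and 3 can be sketched in code as follows. This is an illustration under
assumptions, not a reference implementation: the function names and dictionary keys are
ours, the integrals are approximated by sums over the exact points, and the two score
functions are the first two Legendre polynomials on [0, 1], normalized to unit mean square.

```python
import math


def normalize(y):
    """Normalized data (Y(t) - mean) / sd, with divisor n in the sd."""
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    return [(v - mean) / sd for v in y]


def K1(tau):
    return math.sqrt(12.0) * (tau - 0.5)                    # linear Legendre score


def K2(tau):
    return math.sqrt(5.0) * (6.0 * tau * tau - 6.0 * tau + 1.0)  # quadratic score


def linear_functional(y, K):
    """Phase 3: (1/n) * sum of K((t - .5)/n) times the normalized Y(t)."""
    c = normalize(y)
    n = len(c)
    return sum(K((t - 0.5) / n) * c[t - 1] for t in range(1, n + 1)) / n


def nonlinear_functionals(y):
    """Phase 2: sum approximations over the exact points j/n, j = 1..n-1."""
    c = normalize(y)
    n = len(c)
    points, total = [], 0.0
    for j in range(1, n):
        total += c[j - 1]
        points.append((j / n, total / n))
    return {
        "integral_sq": sum(v * v for _, v in points) / n,
        "weighted_integral_sq": sum(v * v / (t * (1 - t)) for t, v in points) / n,
        "max_abs": max(abs(v) for _, v in points),
        "max_abs_weighted": max(abs(v) / math.sqrt(t * (1 - t)) for t, v in points),
    }
```

For a linear trend the first component is close to 1 and the second is essentially 0, which
is how the components point to the pattern present in the change density.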
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of  data  (score  the  data) 

z.ti  ’ « fiftSs  syss 

etc.)  form  a  smooth  estimator  c(r)  of  the  change  density, 
the  SXeeXo?ih°snpafDere  ‘  W°U'd  re1“re  a  book  and  is  beyond 

6.  Parametric  scores  change  analysis 

^SSSSSeS5“R°”^ft  ^ theVoW^t1onsr  Zy“be 

statistics,  or  more  precisely  th^Fk W  transf°rm^.tlons  are  essentially  the  sufficient 
mode.  ,4;  *)  for^dS  (<)T= 

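The maximum likelihood estimator $\theta^\sim$ is obtained by maximizing the average
log likelihood

$$L(\theta) = (1/n) \sum_{t=1}^{n} \log f(Y(t); \theta).$$

Define score functions

$$S_j(y; \theta) = (\partial/\partial\theta_j) \log f(y; \theta).$$

The maximum likelihood estimator is the solution of the estimating equations, for
$j = 1, \ldots, k$,

$$(1/n) \sum_{t=1}^{n} S_j(Y(t); \theta^\sim) = 0.$$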

m/n%  changepoin,t  Tt.= 

time  m  m  the  sense  that  approximately  ’  up  to  the 

m 

(i/«)^5i(y(0;n  =  o. 

i=l 

We define the score change process to linearly interpolate its values at $\tau = m/n$,

$$C^\sim(\tau; S_j) = (1/n) \sum_{t=1}^{m} S_j^*(Y(t); \theta^\sim)$$

where 

s;(r;r).  =  s^Yj-yE^Y;)-)). 
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We  form  k  score  change  processes,  for  j  =  1, . . . ,  k. 

We call this approach “random walk your normalized scores.” We are developing
the probability theory of the score change processes.

These theoretical concepts can best be understood through examples. Consider
a gamma distribution model

$$f(y; \nu, \theta) = (\theta^\nu \Gamma(\nu))^{-1} y^{\nu - 1} \exp(-y/\theta)$$

where $\theta$ is a positive scale parameter, assumed unknown, and $\nu$ is a positive shape
parameter, assumed known. One can show that the score function of the parameter
$\theta$ is

$$S(Y; \theta) = (1/\theta)((Y/\theta) - \nu);$$

the maximum likelihood estimator is

$$\theta^\sim = Y^\sim/\nu;$$

the normalized score function evaluated at the maximum likelihood estimator of
the parameter may be shown to be

$$S^*(Y(t); \theta^\sim) = \nu^{.5}((Y(t)/Y^\sim) - 1).$$

To test the observations Y(.) for change, one forms the maximum likelihood
score change process $C^\sim(\tau; S^*)$, $0 \le \tau \le 1$, and tests if this dynamic statistic is
significantly different from a sample path of a Brownian Bridge stochastic process.
A linear functional of the change process corresponding to the score function

$$K(\tau) = 12^{.5}(\tau - .5)$$

is

$$C^\sim(K, S^*) = (1/n) \sum_{t=1}^{n} (12\nu)^{.5}((Y(t)/Y^\sim) - 1)((t - .5)/n)$$

$$= (12\nu)^{.5}(1/n) \sum_{t=1}^{n} Y(t)(((t - .5)/n) - .5)/Y^\sim.$$

Under the null hypothesis of no change the asymptotic distribution of $n^{.5} C^\sim(K, S^*)$
is Normal(0,1).
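
The statistic for the gamma model is simple to compute. The sketch below is an
illustration under assumptions (the function names are ours and the shape parameter `nu`
is supplied by the user, as in the text where it is assumed known). It evaluates the
component both as the sum of score function values times normalized scores and in the
simplified second form, which agree because the values $K((t - .5)/n)$ sum to zero.

```python
import math


def gamma_score_component(y, nu):
    """C(K, S*) as (1/n) * sum of K((t - .5)/n) * S*(Y(t))."""
    n = len(y)
    ybar = sum(y) / n
    total = 0.0
    for t, v in enumerate(y, start=1):
        k = math.sqrt(12.0) * ((t - 0.5) / n - 0.5)   # change score K
        s = math.sqrt(nu) * (v / ybar - 1.0)          # normalized score S*
        total += k * s
    return total / n


def gamma_score_component_alt(y, nu):
    """Second form: (12 nu)^.5 (1/n) sum Y(t) ((t - .5)/n - .5) / Ybar."""
    n = len(y)
    ybar = sum(y) / n
    total = sum(v * ((t - 0.5) / n - 0.5) for t, v in enumerate(y, start=1))
    return math.sqrt(12.0 * nu) * total / (n * ybar)


def gamma_test_statistic(y, nu):
    """n^.5 * C(K, S*), approximately Normal(0,1) when there is no change."""
    return math.sqrt(len(y)) * gamma_score_component(y, nu)
```

A value of `gamma_test_statistic` that is large in absolute value relative to the standard
normal distribution suggests a drift in the scale parameter over the sequence.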

An example of an application of this statistic is in Hsu (1979) where it is
presented as a test designed for a small change in the scale parameter $\theta$ of an
independent Gamma distributed sequence, derived by Kander and Zacks (1966) by a
Bayesian analysis assuming the changepoint $\tau$ is uniformly distributed in time. This
test statistic is derived in our approach as analogous to a component in standard
goodness of fit analysis.

7.  Nonparametric  scores  change  analysis 

Our approach to change analysis recommends that one compute and interpret
several change processes formed from several transformations of the original data.
In addition to (or instead of) various parametric score change processes, one can
define various nonparametric score processes for a data sequence Y(t), t = 1, ..., n.
Define:

sample distribution function $F^\sim(y)$;

sample probability mass function $p^\sim(y)$ = fraction of sample equal to y;

mid-distribution function $P^\sim(y) = F^\sim(y) - .5\,p^\sim(y)$.

The mid-rank data transformation forms $P^\sim(Y(t))$, t = 1, ..., n. When all Y
values are distinct, $P^\sim(Y(t)) = (\mathrm{Rank}(Y(t)) - .5)/n$. We recommend this definition
of mid-ranks over the most used definition $\mathrm{Rank}(Y(t))/(n + 1)$.

One chooses a data score function J(u), 0 < u < 1, suitable for testing
nonparametrically various types of changes in the distribution of the data (especially
changes in location or scale parameters). A typical choice for J(u) is a Legendre
polynomial normalized to satisfy

$$\int_0^1 |J(u)|^2\,du = 1.$$

One applies the four phases of change analysis to the transformed data sequence
$J(P^\sim(Y(t)))$. In the third phase one examines and interprets linear functional tests
for change of the form

$$C^\sim(K, J) = (1/n) \sum_{t=1}^{n} K((t - .5)/n)\, J(P^\sim(Y(t)))$$

for suitable change score functions $K(\tau)$. One can usually show that under the null
hypothesis of no change the asymptotic distribution of $n^{.5} C^\sim(K, J)$ is Normal(0,1).
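
The nonparametric version can be sketched as follows. The code is an illustration under
assumptions: the function names are ours, and both the data score function J and the
change score function K default to the linear Legendre polynomial $12^{.5}(u - .5)$. Ties are
handled through the mid-distribution function as defined above.

```python
import math
from collections import Counter


def mid_distribution(y):
    """P~(Y(t)) = F~(Y(t)) - .5 p~(Y(t)) for each observation."""
    n = len(y)
    counts = Counter(y)
    below, P = 0, {}
    for v in sorted(counts):
        # F~(v) = (below + count)/n and p~(v) = count/n
        P[v] = (below + 0.5 * counts[v]) / n
        below += counts[v]
    return [P[v] for v in y]


def legendre1(u):
    return math.sqrt(12.0) * (u - 0.5)


def rank_score_component(y, K=legendre1, J=legendre1):
    """C(K, J) = (1/n) * sum of K((t - .5)/n) * J(P~(Y(t)))."""
    n = len(y)
    P = mid_distribution(y)
    return sum(K((t - 0.5) / n) * J(P[t - 1]) for t in range(1, n + 1)) / n
```

For a strictly increasing sequence of length n the component equals $1 - 1/n^2$, its largest
possible value for this choice of K and J; multiplying by $n^{.5}$ gives the statistic that is
compared with the standard normal distribution.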
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ABSTRACT 

Degraded-states methodology represents a fundamental change in the procedure for
assessing the vulnerability of armored fighting vehicles. Results of such assessments
serve as input to Army wargame models, large stochastic computer simulations which,
for a given scenario, calculate the number of kills for both friendly and enemy forces.
In an attempt to determine the ultimate effect of this new methodology, one such
wargame was modified to accept the degraded-states input. Three hundred replications
of the model were run using conventional input, degraded-states input, and some
aggregate of the two. Outputs from the wargame, in the form of partial kills, complete
kills, and ratio of kills, were compared using contingency tables to determine whether
differences in the results were statistically significant.


I.  INTRODUCTION 

Degraded-states methodology represents a fundamental change in the procedure for assessing the
vulnerability of armored fighting vehicles. The traditional metric for vulnerability analysis was derived
from a mapping procedure applying the Standard Damage Assessment List (SDAL) to the calculated
damage state of the vehicle. However, studies had shown that the theory behind this metric was
problematical in a conceptual sense.1,2 The degraded-states metric was proposed as an alternative
procedure (more appealing because the mathematical foundation is more rigorous).3


1 Rapp, J. R. "An Investigation of Alternative Methods for Estimating Armored Vehicle Vulnerability." BRL-MR-03290,
U.S. Army Ballistic Research Laboratory, Aberdeen Proving Ground, MD, July 1983.

2 Starks, M. W. "New Foundations for Tank Vulnerability Analysis." The Proceedings of the Tenth Annual Symposium on
Survivability and Vulnerability of the American Defense Preparedness Association (ADPA), Naval Ocean Systems Center,
San Diego, CA, 10-12 May 1988.

3 Abel, J., L. Roach, and M. Starks. "Degraded-States Vulnerability Analysis." BRL-TR-3010, U.S. Army Ballistic Research
Laboratory, Aberdeen Proving Ground, MD, June 1989.

An immediate concern was how the vulnerability results obtained using degraded-states methodology
could be incorporated into the large wargames so prevalent in U.S. Army studies, since these results did
not conform to those from traditional analyses. Accordingly, the U.S. Army Ballistic Research Laboratory
(BRL) and the U.S. Army Materiel Systems Analysis Activity (AMSAA) commenced a joint program to
further develop and implement the degraded-states methodology. Consequently, the Degraded-States
Weapons Analysis Research Simulation (DSWARS) was written.4 It is an adaptation of a previous
stochastic ground-combat simulation and will accept as input the degraded-states metric as well as the
traditional SDAL-based metric.

For  a  specific  set  of  scenarios,  DSWARS  was  run  in  three  different  modes— standard  damage 
assessment  list,  degraded  states,  and  some  aggregate  derived  from  the  first  two.  AMSAA  and  BRL  were 
interested  in  whether  the  differences  in  results  were  statistically  significant,  and  that  is  the  question  which 
will  be  addressed  in  this  paper. 


II.  WARGAME  RESULTS 

DSWARS was run at two visibility ranges using two types of attacker bullets and two types of
defender bullets. Thus, eight (2^3) different scenarios were examined. The simulation was run 300 times
for each scenario using each methodology, for a total of 7,200 runs. Summary results are shown in Table 1.
The first column indicates the scenario, where the first digit represents the attacker bullet (1 = penetrator,
2 = nonpenetrator), the second digit represents the defender bullet, and the third digit represents the range
of visibility (3 km, 7 km). A penetrator bullet will penetrate frontal armor; a nonpenetrator bullet will
not but may penetrate other armor. The second column indicates the methodology employed, while the
third column lists the side. Blue is always defending with three armored vehicles in a hull-defilade mode,
and red is always attacking with nine armored vehicles in a fully-exposed mode. The simulation
commenced at a range of 4,000 m and continued until the nearest attacker closed to within 500 m of any
defender.


The results presented are for a firepower kill on the armored vehicles. The table shows the number
of complete kills for both red and blue. The degraded-states methodology calculates partial kills on
vehicles, and so the column labeled "any" is the sum of the complete and partial kills. Notice that the
"complete" column is identical to the "any" column for the remaining methodologies. The final two
columns are the ratios of red kills to blue kills for both "complete" and "any."


4 Comstock, G. R. "The Degraded-States Weapons Analysis Research Simulation (DSWARS): An Investigation of the
Degraded States Vulnerability Methodology in a Combat Simulation." TR-495, U.S. Army Materiel Systems Analysis Activity,
Aberdeen Proving Ground, MD, February 1991.

I was asked to compare the three methodologies in a pairwise fashion for four different measures:
1) attacker and defender complete-firepower kills, 2) attacker and defender any-firepower kills, 3) the ratio
of attacker complete-firepower kills to defender complete-firepower kills, and 4) the ratio of attacker
any-firepower kills to defender any-firepower kills. Merely comparing the summary results would not provide
a clear indication of how DSWARS was reacting to the different methodologies; while the average number
of kills might be close over the 300 runs, individual replications could be considerably different. There
were also advantages to comparing each scenario individually. The results might be similar at the longer
visibility range but disagree significantly at the shorter one. Therefore, I requested the intermediate results
(i.e., the number of red kills and the number of blue kills in each replication for each scenario).
Furthermore, since these numbers are not independent, the data were categorized into the number of "x"
blue kills and "y" red kills, where x = 0, 1, ..., 3 and y = 0, 1, ..., 9.

Intermediate results for the ratio data were a bit more difficult to separate. In an individual
replication, if the number of blue kills totaled zero, then the ratio of red kills to blue kills went to infinity.
Furthermore, there were a large number of different ratios that were possible. Therefore, the intermediate
results for ratios were grouped into ten specified intervals (0.0-0.9, 1.0-1.9, ..., 8.0-8.9, 9.0 and over),
placing those ratios equal to infinity into the top interval.

III. STATISTICAL ANALYSIS

In attempting to evaluate the consistency of DSWARS results under the three different methodologies,
I have used the chi-square test for differences in probabilities within contingency tables to determine
whether or not the differences in output are statistically significant. This procedure can be found in most
elementary statistical textbooks.5 The chi-square test is used to test a hypothesis; in this case, that there
is no difference between results obtained using the SDAL method, the degraded-states method, and the
aggregate method. These three methods can be considered populations, and output from the 300
replications of DSWARS can be considered random samples from these populations.


5 Conover, W. J. Practical Nonparametric Statistics. New York: John Wiley & Sons, 1971.

Table 2 displays the contingency table from the BMDP statistical software package for one of the
any-firepower kill-ratio comparisons of the SDAL method to the degraded-states method. The
rows represent the different methodologies; the columns are bins that represent intervals (x, x + 0.9)
for x = 0, 1, ..., 9, into which the ratio of red kills to blue kills can conceivably fall. The top bin includes
ratios as large as infinity. When examining firepower kills rather than ratios, the columns of the
contingency tables are bins that represent "x" blue kills and "y" red kills for x = 0, 1, ..., 3 and y = 0, 1, ..., 9.
Recall that blue and red kills are combined in the tables since they are not independent (i.e.,
a large number of red kills would imply a tendency for a small number of blue kills). The intersection
of a row and a column shows the number of occurrences out of 300 replications that such
a "column" output results from such a "row" method.


Three assumptions should be satisfied when using the chi-square test for differences in probabilities:

(1) each sample is a random sample,
(2) the various samples are all mutually independent, and
(3) each observation falls into exactly one cell.


Let $p_{ij}$ be the probability that a randomly selected value from the $i$th population falls into the $j$th bin.
The null hypothesis is that all the probabilities in the same column of the contingency table are equal
(i.e., $p_{1j} = p_{2j} = \cdots = p_{rj}$ for every $j$, so that the distributions are identical). The expected number
of observations for each cell, assuming a true null hypothesis, is calculated. The test statistic is defined as

\[ T = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \tag{1} \]

where

$r$ = number of rows
$c$ = number of columns
$O_{ij}$ = number of actual observations in cell $(i,j)$
$E_{ij}$ = number of expected observations in cell $(i,j)$

and

\[ E_{ij} = \frac{n_i C_j}{N} \tag{2} \]

where

$n_i$ = number of observations in population $i$
$C_j$ = number of observations in bin $j$
$N$ = total number of observations.

This  test  statistic  is  computed  and  subsequently  compared  with  the  chi-square  distribution. 

The chi-square distribution is used as a large-sample approximation, since the exact distribution of the
test statistic is difficult to determine. A widely held belief is that this approximation is good, provided
that fewer than 20% of the cells have an expected number of observations below 5. If this is not true, or if there are
cells where the expected number of observations is zero, then several categories should be combined to
overcome the problem.
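The computation in Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration, not the BMDP procedure used in the study: the function names and the example counts are hypothetical, and the degrees of freedom, (r - 1)(c - 1), are the standard value for this test rather than something stated above.

```python
def chi_square_statistic(table):
    """Return (T, df, expected) for an r x c table of observed counts O_ij."""
    row_totals = [sum(row) for row in table]          # n_i
    col_totals = [sum(col) for col in zip(*table)]    # C_j
    total = sum(row_totals)                           # N
    if any(c == 0 for c in col_totals):
        # An empty bin gives E_ij = 0; the text says to combine categories first.
        raise ValueError("combine categories: a bin has no observations")
    # Equation (2): E_ij = n_i * C_j / N
    expected = [[n_i * c_j / total for c_j in col_totals] for n_i in row_totals]
    # Equation (1): T = sum over all cells of (O_ij - E_ij)^2 / E_ij
    t = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return t, df, expected


def fraction_small_cells(expected, threshold=5.0):
    """Share of cells whose expected count is below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical counts: two methods (rows), three bins (columns), 300 runs each.
observed = [[120, 100, 80], [90, 110, 100]]
t_stat, dof, exp_counts = chi_square_statistic(observed)
print(t_stat, dof, fraction_small_cells(exp_counts))
```

The statistic would then be referred to a chi-square table with `dof` degrees of freedom, after checking that `fraction_small_cells` stays under the 20% guideline.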

The results are shown in Table 3. The left half of the table pertains to kills, both complete-firepower
(CF) and any-firepower (AF). The right half of the table pertains to kill ratios, complete-firepower (CFX)
and any-firepower (AFX). Each number in the table represents a p-value, which is the probability of being
in error if the null hypothesis is rejected. In other words, if the methods are equivalent, the probability
of seeing differences equal to or greater than those observed in these samples of 300 is equal to p. A
p-value ≤ 0.05 would generally lead to rejection of the null hypothesis. This would lead to the conclusion
that results obtained using the different methodologies are dissimilar. Therefore, the methodologies should
not be mixed in DSWARS, since it would be unclear whether differences in any subsequent comparisons
of war-game results were true differences or merely a manifestation of these dissimilar methodologies.
The column labeled "overall" in the table refers to a single comparison of all three methods; columns to
the right show the results of pairwise comparisons. The results for the kill ratios seem to mimic those for
the kills themselves, with the notable exception of scenario 213 AF.

In summary, the p-values indicate that for complete-firepower kills, the three methods generally give
different results. For any-firepower kills, the p-values are slightly larger, but the general conclusions
remain unchanged. In each case, the SDAL method and the aggregate method seem to agree more often
than the other two pairs, but still not consistently.

IV. PROBLEM

I decided to compare the intermediate results produced by DSWARS because I was concerned that
merely examining the final results might mask some subtle difference in methodologies. I also
thought it reasonable to compare each scenario individually, hoping this might provide some additional
[...] other than similarity of the methodologies. Eventually, such a problem arose.

Comparison of the SDAL method with the degraded-states method provided what appeared [...].
For scenario 123 CF, the complete-firepower kill ratios were close (i.e., 1.4 and
1.3 [Table 1]). For this case, the p-value was 0.013 [Table 3], indicating that the null hypothesis of no
difference should be rejected. The reverse held for scenario
223 AF, where the any-firepower kill ratios were quite different (3.5 and 4.3); but the p-value was 0.313,
implying that the null hypothesis should not be rejected.

After discussing these outcomes with the developer of DSWARS, I realized that the kill ratios were
calculated differently from what I had expected. While the final results for kills are simply the
intermediate results averaged over the number of replications, the final ratios are not such averages but
simply the ratio of the final kill results. In other words, I expected


\[ \text{Ratio} = \frac{R_1/B_1 + R_2/B_2 + \cdots + R_n/B_n}{n} = \frac{R_1}{nB_1} + \frac{R_2}{nB_2} + \cdots + \frac{R_n}{nB_n} \tag{3} \]

Instead, I received

\[ \text{Ratio} = \frac{(R_1 + R_2 + \cdots + R_n)/n}{(B_1 + B_2 + \cdots + B_n)/n} = \frac{R_1 + R_2 + \cdots + R_n}{B_1 + B_2 + \cdots + B_n} \tag{4} \]

Unless the number of blue kills is the same in each replication, Equation (3) and Equation (4) will give
different results for the ratio. If the individual kill ratios ($R_i/B_i$) include high values, Equation (4) will be
less than Equation (3). Of course, Equation (3) breaks down when an intermediate ratio becomes infinity.
Apparently, the pros and cons of these two equations had been discussed years ago; Equation (4) remains
the procedure of choice for this simulation.
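A small numerical sketch may make the difference between the two definitions concrete. The function names and the kill counts below are hypothetical and are not DSWARS output; the treatment of a replication with zero blue kills as an infinite ratio follows the description given earlier for the binned ratios.

```python
def mean_of_ratios(red, blue):
    """Equation (3): average of the per-replication ratios R_i / B_i."""
    # A replication with no blue kills contributes an infinite ratio.
    ratios = [r / b if b != 0 else float("inf") for r, b in zip(red, blue)]
    return sum(ratios) / len(ratios)


def ratio_of_means(red, blue):
    """Equation (4): mean red kills divided by mean blue kills."""
    # The common factor 1/n cancels, leaving a ratio of totals.
    return sum(red) / sum(blue)


# Hypothetical replications: one high individual ratio (6/1) pulls
# Equation (3) above Equation (4).
red_kills = [6, 4, 2]
blue_kills = [1, 2, 2]
print(mean_of_ratios(red_kills, blue_kills))   # (6 + 2 + 1) / 3
print(ratio_of_means(red_kills, blue_kills))   # 12 / 5

# With a zero-blue-kill replication, Equation (3) is no longer finite.
print(mean_of_ratios([6, 4, 2], [0, 2, 2]))
print(ratio_of_means([6, 4, 2], [0, 2, 2]))
```

In this toy case Equation (3) gives 3.0 and Equation (4) gives 2.4, so two methodologies could agree on one measure and disagree on the other.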

Table 4 supports the p-value from Table 3. For the case of interest, it shows means and standard
deviations of the number of kills along with 99% confidence intervals. The first column represents the
scenario, while the second indicates the side (blue or red), and the third indicates the type of kill (CF or
AF). The final two groups of columns represent the SDAL method and the degraded-states method.
Notice that for scenario 123 CF, the 99% confidence intervals for mean blue kills do not overlap. For
mean red kills, the confidence intervals overlap slightly. This is an indication that the two methodologies
disagree in terms of the number of kills. In spite of this, the ratio of their means is close. The
nonoverlapping confidence intervals lend credence to the low p-value of 0.013. Now look at scenario
223 AF. The 99% confidence intervals overlap in both cases, even though the ratio of their means is quite
different. Recall that the p-value for this case was 0.313, which is again intuitive given the overlapping
confidence intervals.
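The interval comparison can be checked with a short sketch. It assumes the intervals are large-sample normal intervals, mean ± 2.576 s/√n with n = 300 replications; the paper does not state the formula, so this is an assumption, although it reproduces the printed interval for red kills in scenario 123 CF. The function names are illustrative.

```python
import math

Z_99 = 2.576  # two-sided 99% normal quantile (assumed interval type)


def ci99(mean, sd, n=300):
    """Large-sample 99% confidence interval for a mean."""
    half_width = Z_99 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def overlap(a, b):
    """True if intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Scenario 123 CF, red kills (means and standard deviations from Table 4).
sdal_red = ci99(3.52, 2.25)
ds_red = ci99(2.92, 1.99)
print(sdal_red, ds_red, overlap(sdal_red, ds_red))

# Scenario 123 CF, blue kills, using the intervals as printed in Table 4.
print(overlap((2.46, 2.68), (2.13, 2.37)))
```

Overlap of intervals is only an informal guide; it is used here, as in the text, to make the contingency-table p-values plausible rather than to replace them.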

Kill  ratios  appear  to  be  the  most  important  output  of  the  ground-combat  simulations.  It  is  these 
exchange  ratios  that  the  decision  makers  want  to  see.  Therefore,  it  was  particularly  desirable  to  evaluate 
how  they  are  affected  by  the  different  methodologies.  Since  the  kill  ratios  calculated  in  the  individual 
replications  are  not  used  in  the  evaluation  of  the  final  ratios,  the  contingency  table  approach  was 
inappropriate.  Given  the  procedure  for  calculating  such  ratios,  a  pairwise  comparison  would  seem 
reasonable,  but  then  the  methodologies  must  be  compared  over  all  scenarios.  Also,  in  using  these  data, 
there  are  only  eight  comparisons,  indicating  that  the  statistical  test  may  not  have  much  power  (i.e.,  may 
not  have  great  ability  to  detect  a  false  null  hypothesis).  I  did  use  the  Wilcoxon  signed-ranks  test  to 
examine  the  null  hypothesis  between  the  SDAL  method  and  the  degraded-states  method.  The  p-values 
were  0.484  for  the  CF  ratio  and  0.050  for  the  AF  ratio. 
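With only eight paired observations, the Wilcoxon signed-ranks test can be carried out exactly by enumerating all 2^8 sign patterns. The sketch below is one way to do so; it is not the software used in the study, and the paired ratios are hypothetical placeholders because the eight scenario ratios are not listed here.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y):
    """Return (W+, two-sided exact p-value) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    center = sum(ranks) / 2
    observed = abs(w_plus - center)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - center) >= observed - 1e-12
    )
    return w_plus, extreme / 2 ** n


# Hypothetical kill ratios for eight scenarios under two methodologies.
sdal = [1.4, 2.1, 3.5, 1.9, 2.8, 3.0, 1.2, 2.4]
degraded = [1.3, 2.3, 4.3, 2.2, 3.2, 3.5, 1.35, 3.0]
print(wilcoxon_signed_rank_exact(sdal, degraded))
```

With n = 8 the smallest attainable two-sided p-value is 2/256, which illustrates why so few comparisons give the test limited power.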

V.  CONCLUSIONS 

DSWARS, a stochastic ground-combat simulation, was written to accept input from three different
methods of vulnerability assessment: the standard-damage-assessment-list method, the degraded-states
method, and an aggregate method combining portions of the other two. I used statistical procedures to
test the hypothesis that there is no difference in results from using the various methodologies. Three
hundred replications were run using each methodology for each of eight different scenarios, a total of
7,200 runs.


The  numbers  of  red  kills  and  blue  kills  were  compared  for  individual  scenarios  using  contingency 
tables.  The  intermediate  (individual  replication)  results  were  used  to  test  the  null  hypothesis.  Because 
kill  ratios  are  merely  a  function  of  final  kill  results,  no  intermediate  kill  ratios  could  be  used.  For  this 
case,  I  employed  the  Wilcoxon  signed-ranks  test  to  test  the  null  hypothesis.  However,  it  was  necessary 
to  make  pairwise  comparisons  over  all  scenarios,  and  the  paucity  of  the  data  (only  eight  different 
scenarios)  lessens  the  confidence  in  the  test  results. 

The  primary  conclusion  is  that  the  SDAL  method,  the  degraded-states  method,  and  the  aggregate 
method  produce  different  results  in  DSWARS,  especially  when  examining  the  number  of  kills  for  both 
red  and  blue.  (Exchange  ratio  results  are  inconclusive  and  would  benefit  from  additional  data.)  Of  course, 
this  is  not  always  the  case;  there  are  scenarios  where  they  agree  quite  well.  However,  in  general,  we 
should guard against mixing results from these three methodologies.

Table 1. DSWARS Results - Firepower Kill

Table 2. BMDP Output for Contingency Table Example

Table 3. DSWARS Comparisons: P-Values

Table 4. Selected Confidence Intervals

                          SDAL                          DS
Scenario  Side  Type      Mean  S     99% CI            Mean  S     99% CI
123       B     CF        2.57  0.72  (2.46, 2.68)      2.25  0.82  (2.13, 2.37)
123       R     CF        3.52  2.25  (3.19, 3.85)      2.92  1.99  (2.62, 3.22)
223       B     AF        1.65  0.93  (1.51, 1.79)      1.44  0.92  (1.30, 1.58)
223       R     AF        5.76  2.02  (5.46, 6.06)      6.21  1.80  (5.94, 6.48)


VARIANCE COMPONENTS IN TANK GUN ACCURACY RESEARCH


David  W.  Webb 
Probability  &  Statistics  Branch 
Systems  Engineering  &  Concepts  Analysis  Division 
U.S.  Army  Ballistic  Research  Laboratory 
Aberdeen  Proving  Ground,  MD  21005-5066 
webb@brl.mil  (410)  278-6646 


Abstract 

Using a nested-factorial design, the Ballistic Research Laboratory
conducted a 480-round test to determine if a proposed manufacturing
process referred to as dynamic indexing would reduce tube-to-tube
variability, $\sigma^2_{\text{Tube}}$, of the U.S. Army's M256 tank cannon. For each of four
different 120-mm ammunition types and for both horizontal and vertical
axes, variance component estimates of $\sigma^2_{\text{Tube}}$ were calculated for
dynamically indexed tubes and for standard tubes. Using an indirect
hypothesis test, estimators for the two tube types were compared. The
analysis showed that dynamic indexing does not reduce $\sigma^2_{\text{Tube}}$ as hoped.


I.  INTRODUCTION 

Through the analysis of several statistically designed experiments, researchers at
the U.S. Army Ballistic Research Laboratory (BRL) have learned that tube-to-tube
variability* is one of the major contributors to the overall variation in jump for the
M1A1 Tank System. Jump is defined as the difference between the aimpoint and impact
location after corrections have been made for all known sources of error (e.g., muzzle
velocity and wind conditions).

In an ongoing effort to reduce tube-to-tube variation in the M1A1, a concept referred
to as dynamic indexing was developed for the M256 cannon. A brief description of this
concept follows. Using data describing the gun dynamics during firing, a reference profile
is defined that compensates for tube droop and minimizes the perturbations to sabot
projectiles at ambient temperature. This differs from the standard indexing procedure in
which the reference profile only compensates for tube droop. Under both indexing
procedures, measurements of the tube centerline profile are taken and the tube
mathematically rotated until the best match to the reference profile is obtained. Once
this rotation is accomplished, the muzzle upstand, breech-interrupted threads, and breech
locking plug are machined, thus irrevocably defining "up" for the gun tube.

As part of a prototype test, five gun tubes were randomly selected from the
production line at Watervliet Arsenal and were dynamically indexed. These tubes were
shipped to Aberdeen Proving Ground and fired from a single M1A1 Tank using 105
rounds of 120-mm ammunition. The results of this test showed "a tendency to reduce
tube-to-tube variability, providing an improvement in fleet hit probability" (Schmidt et
al., 1989). Based on these initial results, a more complete proof-of-principle test was
recommended.

* "Tube-to-tube variability" is a term used by researchers in the tank gun accuracy community. It is
equivalent to what statisticians might refer to as "between-tube variability."

II. TEST PLAN

The  primary  objective  of  the  test  was  to  determine  if  the  tube-to-tube  variability  of 
dynamically  indexed  tubes  (DITs)  is  statistically  smaller  than  the  tube-to-tube  variability 
of  standard  tubes  (STs).  In  other  words,  "Does  dynamic  indexing  of  the  M256  cannon 
reduce  the  dispersion  of  centers-of-impact?"  (see  Figure  1). 

The  Probability  &  Statistics  Branch  of  the  BRL  devised  several  experimental  design 
plans  and  presented  them  before  the  Project  Manager  for  Tank  Main  Armament  Systems. 
The  relative  merits  of  the  design  plans  were  discussed  with  concerns  expressed  for 
economy  of  available  resources  and  robustness  of  test  conditions. 

Ultimately,  one  of  the  plans  was  selected  and  slightly  modified  for  a  detailed 
comparison  of  DITs  and  STs.  The  design  called  for  four  types  of  120-mm  ammunition, 
twenty  DITs,  and  twenty  STs.  To  make  the  experiment  and  the  results  as  robust  as 
possible,  it  also  called  for  four  tanks  and  four  ammunition  temperatures.  However,  in 
order  to  reduce  the  amount  of  testing,  the  tank  and  ammunition  temperature  factors 
were  confounded.  Therefore,  if  this  confounded  factor  was  found  to  be  statistically 
significant,  one  would  be  unable  to  distinguish  between  the  Tank  effect,  the  Ammunition 
Temperature  effect,  or  their  interaction.  For  each  treatment  combination,  three  rounds 
were  fired.  The  complete  test  design  matrix  is  shown  in  Figure  2. 
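The round count implied by this design can be verified by enumerating it. The sketch below assumes that five tubes of each type are assigned to each tank/temperature level and that every tube fires all four ammunition types, which is how the model in Section IV indexes the data; the tube labels are invented for illustration.

```python
from itertools import product

AMMO_TYPES = [1, 2, 3, 4]
TUBE_TYPES = ["DIT", "ST"]
TANK_TEMPS = [1, 2, 3, 4]     # confounded tank / ammunition-temperature factor
TUBES_PER_LEVEL = 5           # assumed: 20 tubes of each type spread over 4 levels
ROUNDS_PER_CELL = 3


def build_design():
    """Return one tuple per round fired: (ammo, tube type, tank/temp, tube, round)."""
    rows = []
    for ammo, tube_type, tank_temp in product(AMMO_TYPES, TUBE_TYPES, TANK_TEMPS):
        for tube in range(1, TUBES_PER_LEVEL + 1):
            # Hypothetical label, unique within a tube type (e.g. "DIT-07").
            tube_id = f"{tube_type}-{(tank_temp - 1) * TUBES_PER_LEVEL + tube:02d}"
            for rnd in range(1, ROUNDS_PER_CELL + 1):
                rows.append((ammo, tube_type, tank_temp, tube_id, rnd))
    return rows


design = build_design()
print(len(design))                                   # total rounds
print(len({row[3] for row in design}))               # distinct tubes
print(sum(1 for row in design if row[0] == 1))       # rounds per ammunition type
```

Under these assumptions the enumeration gives 480 rounds, 40 tubes, and 120 rounds per ammunition type, matching the counts quoted in the text.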

III. DATA MATRIX

The  trajectory  of  each  round  was  monitored  at  four  different  ranges:  800  m,  1500 
m,  2400  m,  and  3000  m.  From  measurements  taken  at  these  ranges,  azimuth  and 
elevation  jumps  were  computed  and  recorded.  For  various  reasons,  such  as  short-landing 
rounds  or  range  equipment  failures,  the  jumps  were  not  always  obtained  at  each  of  the 
four  ranges.  The  percentage  of  missing  data  at  the  1500-m  and  3000-m  targets 
precluded  any  formal  analysis  at  these  ranges.  Exploratory  analysis  and  the  computation 
of  simple  descriptive  measures  of  the  800-m  and  2400-m  data  indicated  that  the  2400-m 
azimuth  jumps  were  strongly  affected  by  wind.  Therefore,  the  analysis  was  concentrated 
on  the  data  recorded  at  800  m,  since  at  this  shorter  range  the  data  are  believed  to  be  less 
influenced  by  wind  and  other  flight  conditions. 

Using  a  procedure  that  relies  on  the  assumption  that  rounds  fired  on  the  same 
occasion  follow  a  similar  trajectory  profile,  jump  estimates  were  made  for  those  rounds  in 
which  the  800-m  field  data  were  missing. 

Because  of  the  substantial  differences  (e.g.,  aerodynamics  and  threat  capabilities) 
between  the  ammunition  types,  the  comprehensive  data  set  was  divided  into  four  120- 
round  subsets  to  be  analyzed  separately.  Furthermore,  the  azimuth  and  elevation  jump 
values  were  assumed  to  be  independent,  so  that  separate  analyses  were  performed  on  the 
azimuth jumps and the elevation jumps.

Figure 1. The Goal of Dynamic Indexing. Hypothetically, dynamic indexing of the M256 tank cannon lowers
tube-to-tube variability, thereby reducing the dispersion of the
centers-of-impact.

Figure 2. 480-Round Test Design


With  simple  exploratory  procedures,  some  potential  outliers  in  the  data  were 
flagged.  After  conferring  with  test  directors  and  other  researchers  involved  in  the  study, 
some  of  these  jump  values  were  corrected  while  others  remained  unchanged.  No  round 
was  deleted  from  the  analysis  just  because  it  seemed  to  be  a  "flier".  All  jump  values  were 
eventually accepted as accurate and included in the analysis.

IV.  STATISTICAL  ANALYSIS 

The  following  discussion  will  examine  how  the  comparison  between  DIT  and  ST 
tube-to-tube  variability  was  made  for  any  of  the  eight  (four  ammunitions  for  azimuth  and 
elevation)  subsets  of  the  entire  data  set. 

The analysis strategy was to obtain an estimate of the tube-to-tube variance for
those rounds fired from DITs, and to compare it with the tube-to-tube variance estimate
for STs (i.e., $\hat{\sigma}^2_{\text{Tube-DIT}}$ versus $\hat{\sigma}^2_{\text{Tube-ST}}$). This required dividing the data in half into 60
DIT rounds and 60 ST rounds. The test design for such a set of 60 jump values appears
in Figure 3. The model for the standard tubes azimuth data, for example, is

\[ x_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \epsilon_{k(ij)} \]

where
$x_{ijk}$ = jump of $k$th round from $j$th standard tube on $i$th tank/temp, measured in mils;
$\mu$ = overall mean;
$\alpha_i$ = effect of $i$th tank/temp, $i$ = 1, 2, 3, 4;
$\beta_{j(i)}$ = effect of $j$th standard tube on $i$th tank/temp, $j$ = 1, 2, 3, 4, 5;
$\epsilon_{k(ij)}$ = error of $k$th round from $j$th standard tube on $i$th tank/temp, $k$ = 1, 2, 3.

Figure 3. 60-jump test design from which the variance component
estimate $\hat{\sigma}^2_{\text{Tube}}$ is derived. For each combination of Direction (azimuth or
elevation), Tube Type (DIT or ST), and Ammunition Type (1, 2, 3, or 4),
the test design is as shown above where each "x" represents a jump value.

The confounded factor, Tank/Temp, and the nested factor, Tube, were each
considered to be random. Furthermore, each $\alpha_i$, $\beta_{j(i)}$, and $\epsilon_{k(ij)}$ was assumed to be
normally distributed with mean zero and variance $\sigma^2_{\text{Tank/Temp}}$, $\sigma^2_{\text{Tube-ST}}$, and $\sigma^2_{\text{Rnd-ST}}$,
respectively.


For an analysis-of-variance under these model assumptions, the degrees-of-freedom
and expected mean-squares (EMS) associated with each factor are:

Source       df    EMS
Tank/Temp     3    $\sigma^2_{\text{Rnd-ST}} + 3\sigma^2_{\text{Tube-ST}} + 15\sigma^2_{\text{Tank/Temp}}$
Tube         16    $\sigma^2_{\text{Rnd-ST}} + 3\sigma^2_{\text{Tube-ST}}$
Round        40    $\sigma^2_{\text{Rnd-ST}}$


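The method-of-moments estimator for the variance component $\sigma^2_{\text{Tube-ST}}$ is given by

\[ \hat{\sigma}^2_{\text{Tube-ST}} = \frac{MS_{\text{Tube-ST}} - MS_{\text{Rnd-ST}}}{3} \]

Likewise, for the DIT data, an estimator for $\sigma^2_{\text{Tube-DIT}}$ is

\[ \hat{\sigma}^2_{\text{Tube-DIT}} = \frac{MS_{\text{Tube-DIT}} - MS_{\text{Rnd-DIT}}}{3} \]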
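As an illustration of the method-of-moments calculation, the sketch below computes the Tube and Round mean squares for a balanced nested layout and then the tube-to-tube variance component. The function name and the data layout (`data[i][j][k]` for tank/temp i, tube j, round k) are assumptions made for this example. For the design described here the divisor is 3 rounds per tube, and the degrees of freedom come out to 16 and 40.

```python
def tube_variance_component(data):
    """Return (MS_Tube, MS_Rnd, sigma2_Tube_hat) for balanced nested data.

    data[i][j][k] is the jump of round k from tube j on tank/temp level i.
    """
    a = len(data)            # tank/temp levels
    b = len(data[0])         # tubes nested in each level
    n = len(data[0][0])      # rounds per tube
    tube_means = [[sum(tube) / n for tube in level] for level in data]
    level_means = [sum(means) / b for means in tube_means]
    # Tubes within tank/temp: a * (b - 1) degrees of freedom.
    ss_tube = n * sum(
        (tube_means[i][j] - level_means[i]) ** 2
        for i in range(a) for j in range(b)
    )
    # Rounds within tubes: a * b * (n - 1) degrees of freedom.
    ss_rnd = sum(
        (x - tube_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )
    ms_tube = ss_tube / (a * (b - 1))
    ms_rnd = ss_rnd / (a * b * (n - 1))
    # EMS(Tube) - EMS(Round) = n * sigma2_Tube, so divide the difference by n.
    return ms_tube, ms_rnd, (ms_tube - ms_rnd) / n


# Tiny hypothetical data set: 1 level, 2 tubes, 2 rounds each.
print(tube_variance_component([[[1, 3], [5, 7]]]))
```

A method-of-moments estimate of this kind can be negative when the Tube mean square falls below the Round mean square, which is one reason the sampling distribution of the estimator is awkward to work with.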


Once each estimate was computed, there was still the task of how to determine if
there was a significant reduction in tube-to-tube variance with dynamic indexing.
Unfortunately, this could not be done directly since the distributions of $\hat{\sigma}^2_{\text{Tube-DIT}}$
and $\hat{\sigma}^2_{\text{Tube-ST}}$ are extremely complex and there was no known method for comparing two
random variables coming from such distributions.

An indirect test for the equality of variance components was needed, and after
conferring with several academicians a novel approach was attempted (Mathew 1991).
For each data set, the round-to-round variability is denoted by either $\sigma^2_{\text{Rnd-DIT}}$ or $\sigma^2_{\text{Rnd-ST}}$
depending upon the tube type. While there is no reason to believe that dynamic indexing
has any effect on the round-to-round variance, this assumption can be tested. Estimators
for $\sigma^2_{\text{Rnd-DIT}}$ and $\sigma^2_{\text{Rnd-ST}}$ are $MS_{\text{Rnd-DIT}}$ and $MS_{\text{Rnd-ST}}$, which are multiples of $\chi^2$ random
variables with 40 degrees of freedom. Therefore,

\[ F = \frac{MS_{\text{Rnd-DIT}}}{MS_{\text{Rnd-ST}}} \]

is a test statistic for the two-tailed hypothesis test

$H_0$: $\sigma^2_{\text{Rnd-DIT}} = \sigma^2_{\text{Rnd-ST}}$, against
$H_A$: $\sigma^2_{\text{Rnd-DIT}} \ne \sigma^2_{\text{Rnd-ST}}$.

Next, note that

\[ F' = \frac{MS_{\text{Tube-ST}}}{MS_{\text{Tube-DIT}}} \]

is a test statistic for the hypothesis test

$H_0$: $\sigma^2_{\text{Rnd-ST}} + 3\sigma^2_{\text{Tube-ST}} = \sigma^2_{\text{Rnd-DIT}} + 3\sigma^2_{\text{Tube-DIT}}$, against
$H_A$: $\sigma^2_{\text{Rnd-ST}} + 3\sigma^2_{\text{Tube-ST}} > \sigma^2_{\text{Rnd-DIT}} + 3\sigma^2_{\text{Tube-DIT}}$.

However, if the ratio $F = MS_{\text{Rnd-DIT}}/MS_{\text{Rnd-ST}}$ is insignificant, then $F'$ becomes an
indirect test statistic for the equality of tube-to-tube variability; that is,
$F' = MS_{\text{Tube-ST}}/MS_{\text{Tube-DIT}}$ tests the hypothesis

$H_0'$: $\sigma^2_{\text{Tube-ST}} = \sigma^2_{\text{Tube-DIT}}$, against
$H_A'$: $\sigma^2_{\text{Tube-ST}} > \sigma^2_{\text{Tube-DIT}}$.

On the other hand, if the test for equality of round-to-round variabilities is
significant, then this would contaminate the indirect approach suggested above. In such
a case, graphical comparison of the variance component estimates, $\hat{\sigma}^2_{\text{Tube-DIT}}$ and $\hat{\sigma}^2_{\text{Tube-ST}}$,
with other pairs of estimates that are known to differ could provide a subjective
evaluation of $\sigma^2_{\text{Tube-DIT}}$ and $\sigma^2_{\text{Tube-ST}}$.
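The two-step logic can be sketched as follows. To keep the example free of external libraries, tail probabilities of the F distribution are approximated by simulation rather than taken from tables; the function names, the 0.05 level used for the preliminary test, the number of draws, and the fixed seed are all assumptions of this sketch and not part of the reported analysis.

```python
import random


def f_upper_tail(f_obs, df_num, df_den, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(F >= f_obs) for an F(df_num, df_den) variable."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_draws):
        # A chi-square(k) variable is Gamma(shape = k/2, scale = 2).
        num = rng.gammavariate(df_num / 2, 2.0) / df_num
        den = rng.gammavariate(df_den / 2, 2.0) / df_den
        if num / den >= f_obs:
            count += 1
    return count / n_draws


def indirect_tube_test(ms_rnd_dit, ms_rnd_st, ms_tube_dit, ms_tube_st, alpha=0.05):
    """Step 1: two-tailed F test on round-to-round variances (40 and 40 df).
    Step 2: if step 1 is not significant, one-tailed F' test (16 and 16 df)."""
    f_round = ms_rnd_dit / ms_rnd_st
    upper = f_upper_tail(f_round, 40, 40)
    p_round = min(1.0, 2 * min(upper, 1 - upper))
    result = {"F": f_round, "p_round": p_round, "F_prime": None, "p_tube": None}
    if p_round > alpha:
        # Round-to-round variances look equal, so F' compares the tube components.
        f_prime = ms_tube_st / ms_tube_dit
        result["F_prime"] = f_prime
        result["p_tube"] = f_upper_tail(f_prime, 16, 16)
    return result


# Hypothetical mean squares, not results from the test.
print(indirect_tube_test(1.1, 1.0, 2.0, 3.5))
```

When the first step rejects, the sketch returns no tube-level p-value, mirroring the text's warning that the indirect approach is then contaminated.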

V.  RESULTS 

Due to the classification of this subject matter, numerical results cannot be
divulged. Qualitatively, the overall conclusions were quite surprising and somewhat
disappointing. For the 120-mm ammunition type that was used in the prototype test,
which showed that dynamic indexing resulted in a lowering of $\sigma^2_{\text{Tube}}$, the azimuth and
elevation tube-to-tube variance estimates were statistically greater with DITs than they
were with STs. The reason for this is still unknown.

Among the other 120-mm ammunition types, only one showed a significant
reduction in $\sigma^2_{\text{Tube}}$ with dynamically indexed tubes and that occurred only in azimuth. All
other cases were insignificant.

ABSTRACT

Uncontrolled  sources  of  variability  during  experimentation  may  mask 
significant  results  and  may  consequently  hinder  the  ability  of  the  Department 
of Defense (DoD) decision makers to draw conclusions concerning weapon systems
or standing operating procedures (SOPs). To protect against drawing the wrong
conclusions,  one  must  use  experimental  methodology  that  reduces,  identifies, 
or  controls  sources  of  variability.  To  illustrate  this  point,  a  field 
exercise,  which  is  designed  to  determine  how  quickly  and  accurately  soldiers 
can  identify  enemy  ordnance  using  a  prototype  expert  system  compared  to  the 
standard  60  series  technical  manual,  is  presented,  along  with  results  and 
conclusions.  A  design  strategy  that  allows  experimenters  to  estimate  and  test 
these  uncontrolled  sources  of  variability  is  provided. 


INTRODUCTION 

Since  decision  making  based  on  the  use  of  statistical  tools  almost 
always  involves  collection  of  data,  the  way  in  which  the  data  are  collected 
becomes  extremely  important.  The  design  of  an  experiment  has  been  defined 
very  simply  as  the  order  in  which  a  combination  of  experimental  variables  is 
run  or  controlled. 

Experimental designs are employed in the Army's research, developmental
testing, and evaluation efforts to assure that unbiased and correct decisions
are made regarding equipment, SOPs, and weapon systems.

Variation  produced  by  disturbing  factors,  both  known  and  unknown,  is 
called  experimental  error.  Important  effects  may  be  wholly  or  partially 
obscured  by  experimental  error.  Conversely,  through  experimental  error,  the 
experimenter  may  be  misled  into  believing  in  effects  that  do  not  exist. 
Experimental  designs  are  used  to  help  reduce  the  experimental  error  in  the 
data collected. Randomization or counterbalancing is employed to help
average out the effect of many extraneous factors which may be present in an
experiment.

Unfortunately,  uncontrolled  sources  of  variability  during 
experimentation  may  mask  significant  results  and  may  consequently  hinder  the 
ability  of  the  Department  of  Defense  (DoD)  decision  makers  to  draw  conclusions 
concerning  weapon  systems  or  SOPs.  To  protect  against  drawing  the  wrong 
conclusions,  one  must  use  experimental  methodology  that  reduces,  identifies, 
or  controls  sources  of  variability.  One  tool  is  to  use  "Model-Based 
Diagnostics"  associated  with  variance-component  estimations  to  assess  the  data 
and  the  experimental  model  as  proposed  by  Hocking  (1985),  demonstrated  by 
Grynovicki  and  Green  (1988),  and  published  by  Hocking,  Green,  and  Brener 
(1989), and Grynovicki (1990).

These new closed form expressions for the estimators of variance
components have been developed, based on the equivalence shown in Hocking,
Green, and Brener (1990) of the variance component estimation problem to the
problem of estimating the covariance, $\theta_t$, between appropriately related
observations. In addition, these estimators have been shown to provide
information that will be useful in diagnosing problems and to suggest simple
graphical procedures for examining the influence of the treatment levels
(Grynovicki and Green, 1988).

To  illustrate  this  point,  a  field  exercise  designed  to  determine  how 
quickly  soldiers  can  identify  enemy  ordnance,  using  a  prototype  expert  system 
compared  to  the  standard  60  series  technical  manual,  is  presented,  along  with 
results,  diagnostics,  and  conclusions.  A  design  strategy  that  allows 
experimenters  to  estimate,  test,  and  identify  these  uncontrolled  sources  of 
variability  is  provided. 


GENERAL  VARIANCE  COMPONENT  ESTIMATES 
AND  MODEL  DIAGNOSTIC  METHODOLOGY 

Before  introducing  the  field  study,  a  general  description  of  the  model 
diagnostics  is  provided.  For  brevity,  this  discourse  is  limited  to  a 
hierarchical  model  with  factor  2  random  and  nested  in  factor  1  and  crossed 
with  factor  3.  All  other  factors  will  be  considered  fixed. 

The number of levels of factor i is designated by a_i. The traditional
univariate repeated measures mixed model is

    Y_{ijkm} = \mu + A_i + AS_{j(i)} + C_k + AC_{ik} + CAS_{kj(i)} + E_{(ijk)m}        (2.1)

in which μ is the grand mean, A_i is the effect of level i of treatment or
factor A, AS_{j(i)} is the effect of level j of factor 2 nested in level i of
factor 1, C_k is the effect of level k of the third treatment, AC_{ik} is the
effect of the AC treatment combination of level ik, CAS_{kj(i)} is the effect
of levels kj(i) of treatment 3 crossed with treatment 2 nested in 1, and
E_{(ijk)m} is the random experimental error. For the traditional univariate
repeated measures approach, it is assumed that μ, A_i, C_k, and AC_{ik} are
fixed treatment means, and AS_{j(i)}, CAS_{kj(i)}, and E_{(ijk)m} are zero
mean, independent normal random variables with variances φ_12, φ_123, and φ_0,
respectively. While these random variables are independent, the responses are
correlated, with the covariance structure


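    \mathrm{Cov}(Y_{ijkm}, Y_{i^*j^*k^*m^*}) =
    \begin{cases}
      0                                        & \text{if } ij \ne i^*j^* \\
      \theta_{12} = \phi_{12}                  & \text{if } ij = i^*j^*,\ k \ne k^* \\
      \theta_{123} = \phi_{12} + \phi_{123}    & \text{if } ijk = i^*j^*k^*,\ m \ne m^* \\
      \phi_0 + \theta_{123}                    & \text{if } ijkm = i^*j^*k^*m^*
    \end{cases}                                                                  (2.2)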


The covariance θ_t is between observations at the same level of the factors
indexed by t and different levels of all other factors in the model. If we
think of the data as arranged in a two-way table with a_1 times a_2 rows and
a_3 columns, we see that φ_12 is the covariance between observations in the
same row but different columns. Under the assumption that there are no
uncontrolled sources of variability and the model and its assumptions are
correct, these covariances are assumed to be the same for all rows and all
pairs of columns. Theta 123 is the covariance between observations in the
same cell and is assumed to be the same for all cells. If the experiment is
not replicated, this covariance is not estimable and is taken to be zero.

This suggests examining the corresponding sample covariances. These
sample covariances, or averages thereof, yield the estimators of the θ_t. The
sample covariance yielding the estimate of φ_12 is

    \hat{\theta}_{12} = (a_1 a_3 r_3)^{-1} \sum_{i} \sum_{k \ne k^*} (r_2)^{-1} \sum_{j}
        (Y_{ijk\cdot} - Y_{i\cdot k\cdot}) (Y_{ijk^*\cdot} - Y_{i\cdot k^*\cdot})        (2.3)

in which a_i is the number of levels of factor i, r_i = (a_i - 1), and Y_{ijk.}
is the cell mean for cell (ijk).

From equation 2.3, one recognizes the θ_12 estimator as an average of
a_1 a_3 r_3 equal sample covariances corresponding to all combinations of
k ≠ k*, i = 1 to a_1. These covariances that are averaged contain the
diagnostic power in determining whether the data contain outliers, whether
there are uncontrolled sources of variability, or whether the data deviate
from the underlying assumptions.
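The averaging described by equation 2.3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `theta12_diagnostics`, the nested-list layout `cell_means[i][j][k]`, the balanced design (the same number of factor 2 levels in every level of factor 1), and the small data set are all assumptions made for the example.

```python
from itertools import permutations


def theta12_diagnostics(cell_means):
    """Return (theta12_hat, diagnostics) for cell means indexed [i][j][k].

    i: level of factor 1, j: level of factor 2 nested in i,
    k: level of factor 3. A balanced layout is assumed.
    """
    a1 = len(cell_means)
    a2 = len(cell_means[0])
    a3 = len(cell_means[0][0])
    r2, r3 = a2 - 1, a3 - 1

    diagnostics = {}
    for i in range(a1):
        # Column means Y_{i.k.} within level i of factor 1.
        col_mean = [sum(cell_means[i][j][k] for j in range(a2)) / a2
                    for k in range(a3)]
        # Ordered pairs (k, k*) with k != k*, as in the sum of equation 2.3.
        for k, k_star in permutations(range(a3), 2):
            cross = sum((cell_means[i][j][k] - col_mean[k]) *
                        (cell_means[i][j][k_star] - col_mean[k_star])
                        for j in range(a2))
            diagnostics[(i, k, k_star)] = cross / r2

    # The estimator is the plain average of the individual sample covariances.
    theta12_hat = sum(diagnostics.values()) / (a1 * a3 * r3)
    return theta12_hat, diagnostics


# Hypothetical data: one group, three subjects, two methods.
example = [[[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]]
theta12_hat, diagnostics = theta12_diagnostics(example)
print(theta12_hat)
print(diagnostics)
```

Returning the individual covariances alongside their average keeps the diagnostic information that the average alone would hide.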

The distribution theory for these diagnostics is developed in Grynovicki
(1989). In this dissertation, the covariance is written in matrix notation,
which is referred to in the literature as a bilinear form. For the above
model, θ_12 can be written Z_1' A Z_2 in which

    Z_1 = (Y_{i1k\cdot}, Y_{i2k\cdot}, \ldots, Y_{i a_2 k\cdot})'
                                                                        (2.4)
    Z_2 = (Y_{i1k^*\cdot}, Y_{i2k^*\cdot}, \ldots, Y_{i a_2 k^*\cdot})'

and A = (I_{a_2} - J_{a_2} J_{a_2}'), where I_{a_2} is the identity matrix of
dimension a_2 and J_{a_2} is a vector of ones of length a_2. The variance
covariance structure of (Z_1, Z_2)
determines  the  probability  density  function.  If  the  variance  covariance 
matrix  has  the  form 


    \begin{pmatrix} aI & cI \\ cI & aI \end{pmatrix}        (2.5)

with a and c being linear combinations of the treatment variances in the
model, one can consider Z_1 and Z_2 as independent, and the distribution of
the diagnostics is
oo


(2n) 


m  1/2 


j 


(n-4)/2 


(1-(?)1/2J  (2a)(n_J)  ri 


exp(-S/ (2a) ) 


<<n-l)/2)  ! 


[- 


(nc-(pa1/2S)/a)2 
2a(l-p2  )S 


] 


dS .  2.6 


If (Z_1, Z_2) have a variance-covariance structure

    \begin{pmatrix} aI + bJJ' & cI + dJJ' \\ cI + dJJ' & aI + dJJ' \end{pmatrix}        (2.7)

one can consider this case as dependent, and the distribution as derived in
Grynovicki (1989) is


YL 


t  (2M+2i+2j  -  2)  /2  e  (t/2a  ^) 

i  j 


ij  2(M+2i+M+2j)/2ai(M+2i)/2bi(M+2j,/2r((M+2i,/2)  r«M+2j>/2) 


x  2.8 


[J  (1+„,(M*2i-2)/2]  dw. 


An interactive computer program that calculates the cumulative
distribution of the variance estimates and diagnostics for either distribution
has been written and is described in Grynovicki and Green (1990). With the
use of this program, Army experimenters can determine if the covariances are
consistent or abnormal given the particular experiment. Thus, inconsistencies
in the data or inadequacies in the model that may lead to erroneous
conclusions can be identified and adjusted.

The program is written in Turbo-Pascal® and can be run on any
IBM-compatible personal computer.


ARMY  FIELD  EXPERIMENT 

To  illustrate  how  uncontrolled  experimental  error  can  influence  the 
decision  process  and  how  the  model  diagnostics  can  reduce,  identify,  or 
control  sources  of  variability,  a  field  exercise  designed  to  determine  how 
quickly  and  accurately  soldiers  can  identify  enemy  ordnance,  using  a  prototype 
expert system compared to the standard 60 series technical manual, is
presented, along with results and conclusions. For illustration, only the
time data will be discussed. It is worth mentioning that both systems were
fairly accurate, with AIRES being slightly but not significantly better
(Grynovicki, Miller, and Krass (1990)).

The U.S. Army Electronics Technology and Devices Laboratory (ETDL)
developed a prototype explosive ordnance disposal (EOD) automated information
retrieval and expert system (AIRES) to assist an EOD team to disarm, detonate,
or otherwise "render safe" ordnance. Traditionally, the EOD team's source of
technical information is the technical manual (TM) 60 series manual, which has
15,000 pages of paper and 1,500 sheets of microfiche. AIRES transforms a
portion of the TM 60 series manual into a portable computer-based system
capable of assisting the EOD technician in the identification and render safe
procedure.

This  proof-of-principle  prototype  system  (see  Figure  1) ,  which  acts  as 
an  intelligent  interface  between  the  user  and  a  large  complex  data  base, 
features  (a)  expert  system  software  for  information  retrieval,  (b)  an  optical 
disk  for  data  storage,  (c)  a  flat  panel  display  for  presenting  graphics  and 
text  data,  and  (d)  touch  interaction  for  operator  input. 

The  operation  of  AIRES  requires  the  user  to  enter  the  known 
characteristics  of  the  unidentified  ordnance.  The  ordnance  is  identified  when 
a  match  between  the  characteristics  entered  and  the  attributes  of  an  item  in 
the  data  base  has  been  achieved.  As  new  characteristics  are  entered,  the 
system  monitors  the  number  of  items  in  the  data  base  that  match  the  user 
input. In the prototype system, data are entered by touching menu choices and
keypad entries presented on the display. When all known data have been
entered or when a unique match has been obtained, the user is presented with
an engineering drawing of the ordnance and is asked to confirm the
identification. The ordnance items were paired because the test participants
(TPs) could identify the same ordnance only once.


Test Matrix

A repeated measures design was used to expose each TP to the manual and
automatic methods of identifying ordnance for disarmament. Groups were the
between-subject factor, and methods were the within-subject factor. The
experiment was counterbalanced for site, method, and ordnance (see Table 1).

Since  the  subject  could  not  identify  the  same  ordnance  more  than  once, 
the  six  items  of  ordnance  were  paired  and  grouped  into  two  groups  of  three  by 
difficulty  (i.e.,  a  mortar  round  (a)  in  group  A  had  the  same  general 
identification  characteristics  and  difficulty  as  a  mortar  round  (f)  in  group 
B). The pairing resulted in the following initial grouping:

Group  A  a  b  c 

Group  B  d  e  f 

The  ordnance  was  then  counterbalanced  to  assure  that  the  same  ordnance 
was  tested  an  equal  number  of  times  for  each  method. 


Subjective  Evaluation 

The independent variables were method of classification (standard versus
AIRES), group (five groups of subjects evaluated at different times), and
subjects. The dependent variables were the time to identify the ordnance and
the number of correctly identified trials.
Figure 1. Explosive ordnance disposal automated information retrieval and expert system
(EOD  AIRES) 


Table 1

Randomization Scheme and Experimental Design for AIRES Evaluation

Experiment I
August 1988

                          Method
Group   Subject   STANDARD     AIRES

  1        1       F A B       C D E
           2       E F A       B C D
           3       C D E       F A B
           4       A B C       D E F
           5       D E F       A B C
           6       B C D       E F A

  2        1       C D E       F A B
           2       B C D       E F A
           3       A B C       D E F
           4       F A B       C D E
           5       E F A       B C D
           6       D E F       A B C

  3        1       A B C       D E F
           2       B C D       E F A
           3       C D E       F A B
           4       D E F       A B C
           5       E F A       B C D
           6       F A B       C D E

  4        1       F A B       C D E
           2       E F A       B C D
           3       D E F       A B C
           4       C D E       F A B
           5       B C D       E F A
           6       A B C       D E F

  5        1       D E F       A B C
           2       C D E       F A B
           3       B C D       E F A
           4       A B C       D E F
           5       F A B       C D E
           6       E F A       B C D

ROUNDS:  A  Model 500/1, 122mm, fuze model RGM-2
         B  Model D-832 Smoke Mortar, 82mm, fuze model M-6
         C  Model OF-A, HE Mortar, 120mm, fuze model MZ-31
         D  Model M59, 122mm, fuze model M51A5
         E  Model EBK-5M or ZBK-5K, Heat, USSR, 100mm, fuze model GPV-2
         F  Model 365-K, Fixed Frag., USSR, 85mm, fuze model KTM-1
To try to control for subject variability, all participants had to be
graduates of the EOD Advanced Training Course at the U.S. Army Ordnance
Missile and Munitions Center and School (USAOMMCS) and had to have experience
using the TM 60 manual. In addition, all participants were given individual
training about AIRES by the Armament Research, Development, and Engineering
Center (ARDEC) using munitions that were different from but equal in
difficulty to the munitions used in the field exercise. The TPs were given
detailed instruction about entering known characteristics of the unidentified
ordnance on AIRES using the touch menu choices and keypad entries. They were
administered a pretest trial and had to correctly identify five ordnance
items before they were considered proficient.

Originally,  40  military  personnel  from  USAOMMCS  and  the  U.S.  Army  Forces 
Command  (FORSCOM)  were  requested,  but  only  30  TPs  were  obtained.  Based  on  the 
pool  of  subjects  that  were  available,  the  participants  in  this  experiment  had 
a  diversity  of  military  occupational  specialties  (MOSs)  from  various  armed 
forces  as  well  as  a  diversity  of  EOD  experience  as  shown  in  Table  2. 


Table 2

Summary of EOD Experience by Armed Forces

Service      Percent of total   TPs    EOD average experience

Army              26.7           8           10.5
Navy              20.0           6            8.2
Marines           16.7           4            9.5
Air Force         26.6           8            7.6
Unknown           10.0           1(a)        12.0

(a) Three additional TPs with no report of years of experience.


RESULTS  AND  DISCUSSION 

The  results  of  the  analysis  for  identification  time  are  shown  in 
Table  3. 

The  traditional  analysis  of  variance  (ANOVA)  indicated  no  significant 
difference  in  time  between  the  two  methods.  The  average  time  to  identify 
ordnance  using  the  standard  method  was  8.41  minutes  as  compared  to  7.46 
minutes to identify the ordnance using AIRES. One reason for being unable to
detect a significant difference between the classification methods may have
been the significant subject-within-group-by-method effect, which is used as
the error term for method. Possible explanations for this inflated error term
include uncontrolled sources of variability and outliers.
Table 3

ANOVA
August 1988
(time in minutes)

Source                           DF       SS        F     Significance (.05)

Groups                            4     215.7     1.91         N.S.
Subject within group             21     593.7     0.94         N.S.
Method                            1      35.1     0.69         N.S.
Method by group                   4      97.4     0.48         N.S.
Method by subject within group   21    1067.8     1.69         S.
Error                           104    3132.0

Note. S. = significant; N.S. = nonsignificant
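The F ratios in Table 3 can be checked directly from the sums of squares and degrees of freedom. The sketch below is illustrative only: the choice of denominator for each effect is an assumption inferred from the text (method is tested against method by subject within group) and from the standard repeated measures layout, and the dictionary names are hypothetical.

```python
# Sums of squares and degrees of freedom copied from Table 3.
table3 = {
    "groups": (215.7, 4),
    "subject_within_group": (593.7, 21),
    "method": (35.1, 1),
    "method_by_group": (97.4, 4),
    "method_by_subject_within_group": (1067.8, 21),
    "error": (3132.0, 104),
}

# Assumed error term (denominator mean square) for each effect.
error_term = {
    "groups": "subject_within_group",
    "subject_within_group": "error",
    "method": "method_by_subject_within_group",
    "method_by_group": "method_by_subject_within_group",
    "method_by_subject_within_group": "error",
}


def mean_square(source):
    """Mean square = sum of squares divided by degrees of freedom."""
    ss, df = table3[source]
    return ss / df


f_ratios = {effect: mean_square(effect) / mean_square(denominator)
            for effect, denominator in error_term.items()}

for effect, f_value in f_ratios.items():
    print(f"{effect}: F = {f_value:.2f}")
```

Under these assumed error terms the computed ratios agree with the F column of Table 3 to two decimal places, which also shows how a large method-by-subject-within-group mean square deflates the F ratio for method.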


To obtain insight into the nature of this significant interaction and
the underlying linear model, the variance component diagnostics for the
random effect subject within group (φ_12) will be evaluated for consistency.
Based on the underlying assumption of sphericity associated with the repeated
measures model, the variance component diagnostics should be statistically
similar. The diagnostics are shown in Figure 2.

[Figure 2 panel: AIRES Method; vertical axis: θ_12 diagnostic; horizontal axis: Group (1 to 5).]

Figure 2. Variance component diagnostics for the random effect subject-
within-group (Experiment I).
In rewriting these covariances as a bilinear form, it can be shown that
the variance-covariance matrix of the two response vectors exhibits the
independent form as outlined in equation 2.5 and, therefore, the distribution
of the diagnostics follows equation 2.6.

A  computer  program  to  calculate  the  cumulative  distribution  of  the 
variance  component  diagnostics  is  presented  in  Grynovicki  (1989) .  The  program 
is  written  in  Turbo-Pascal  Version  5.0®,  and  can  be  compiled  and  run  on  any 
IBM  personal  computer,  provided  Turbo-Pascal  5.0  is  available. 

The  program  uses  Simpson's  integration  method  and  requires  calculation 
of  the  modified  Bessel  function  for  both  integer  and  fractional  order. 
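The numerical step can be illustrated with a short composite Simpson's rule in Python. This is only a sketch of the integration technique, not a port of the Turbo-Pascal program: the function names are hypothetical, and a standard normal density stands in for the diagnostic density, which would additionally require the modified Bessel function mentioned above.

```python
import math


def simpson(f, lower, upper, n_intervals=400):
    """Composite Simpson's rule on [lower, upper] with an even interval count."""
    if n_intervals % 2:
        n_intervals += 1
    h = (upper - lower) / n_intervals
    total = f(lower) + f(upper)
    for i in range(1, n_intervals):
        # Interior points alternate between weights 4 (odd) and 2 (even).
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3


def cumulative_distribution(density, lower, x, n_intervals=400):
    """Approximate the CDF at x by integrating the density from `lower`."""
    return simpson(density, lower, x, n_intervals)


def standard_normal_density(s):
    """Stand-in density used only for illustration."""
    return math.exp(-s * s / 2) / math.sqrt(2 * math.pi)


# Widen the range until the cumulative distribution is effectively one.
print(cumulative_distribution(standard_normal_density, -8.0, 8.0))
print(cumulative_distribution(standard_normal_density, -8.0, 0.0))
```

Checking that the integral over the full range is close to one mirrors the interactive choice of the integration range described for the program.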

When the program is run, a menu appears that prompts the user about
whether the diagnostic being investigated should be considered the dependent
or independent case. Once the distribution is selected, the user is asked to
enter the standard deviations of each response vector, Y_{ijk.} and Y_{ijk*.},
as well as the covariance between the selected pairs. The program also asks
for the sample size and range of integration.

For the diagnostics for φ_12, the variance was estimated as 11.2 and the
covariance as -0.523 based on the method of variance component estimation
found in Hocking, Green, and Brenner (1990). The sample size used was six,
which is conservative, and the range was decided interactively until the
cumulative distribution reached one.

The 5th percentile for the diagnostic distribution was estimated as
-23.12, and the 95th percentile associated with a diagnostic had a value of
3.73. Thus, -42.29 and 13.22 fall outside this interval and indicate a
problem with the underlying linear model and its assumptions.

In investigating this unexplained source of variability, the average
time to identify the three ordnances was examined by the demographics of the
military personnel. As shown in Table 4, the Army military personnel
identified the ordnance slightly quicker than military personnel from other
armed forces. In addition, experience seemed to influence identification
time. Military personnel with more than 5 years' experience identified
ordnance more quickly with both methods than military personnel with 5 years'
experience or less (see Table 5). For the less experienced group,
identification time was decreased an average of 3.3 minutes using AIRES
versus the manual system, while the experienced group had a decrease of only
1.1 minutes. Thus, AIRES seems to benefit the less experienced soldier more.
To investigate these trends, a second experiment was conducted using soldiers
with less than 5 years' experience.
Table 4

Average Identification Time of Three Ordnance by Armed Forces

Armed Forces      Manual            AIRES

Army              23.5 minutes      22.0 minutes
Other(a)          24.7 minutes      23.0 minutes

(a) Others include Air Force, Navy, and Marine Corps personnel.


Table 5

Average Identification Time of Three Ordnance by EOD Experience

Experience           Number of Subjects    Manual            AIRES

1 to 5 years                 6             26.8 minutes      23.5 minutes
More than 5 years           21             23.8 minutes      22.7 minutes


Experiment  II 

Speculating that the inflated error term for method may be because of
the wide range of EOD experience and service background of the subjects, it
was decided that a second field experiment be conducted with TPs having
similar EOD experience and the same service background (all Army). All
participants had 2 years 6 months or less EOD experience, as summarized in
Table 6. A repeated measures design was also chosen for this experiment, with
two groups of six subjects each. Time was the dependent measure. Only 11
subjects successfully completed the field trial. The diagnostics for φ_12 for
the second experiment are shown in Figure 3. The variance for the response
vectors was estimated, using the method previously discussed, as 17.91 with a
correlation of 0.323. The diagnostic was determined to fall under the
independent distribution. A sample size of six was used. The 95th percentile
confidence interval was calculated to be between and including -6.68 and
21.08. As seen from Figure 3, the data appear consistent, and the linear
model and underlying assumption of sphericity appear to hold; the assumption
of sphericity using Box's M statistic could not be rejected.
[Figure 3 panel: Standard Method; horizontal axis: Group.]

Figure 3. Variance component diagnostics for the random effect subject-
within-group (Experiment II).


Table  6 

Subjects'  Training  and  Experience 
Experiment  II 
November  1988 


Group 


Subject 


Service  and  MOS 


EOD  experience 
(years  and  months) 


USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 

USA/55D10 


A  univariate  analysis  for  repeated  measures  was  then  performed.  The 
results  are  shown  in  Table  7. 


Table 7

ANOVA
Experiment II
November 1988
(time in minutes)

Source                           DF       SS        F     Significance (.05)

Groups                            1      29.99    0.42         N.S.
Subject within group              9     640.85    1.95         N.S.
Method                            1     351.82    9.70         S.
Method by group                   1      46.94    1.29         N.S.
Method by subject within group    9     326.13    1.00         N.S.
Error                            43    1568.50

Scheffé's Test

Scheffé Grouping     Mean      Method

       A            10.85      STANDARD
       B             6.19      AIRES

Note. S. = significant; N.S. = nonsignificant

The F statistic for method (F[1,9] = 9.70) indicated that there was a
statistically significant difference between the automatic and manual methods
of identification. A soldier using the standard method took almost 11 minutes
to identify the ammunition, while using the AIRES method decreased the average
time by 4.6 minutes to 6.18 minutes. There were no significant differences
between soldiers or interactions regarding time of identification. All
soldiers but one identified the ordnance faster using AIRES, as shown in
Table 8.


Thus,  there  was  a  significant  identification  time  difference  between  the 
automatic  (AIRES)  and  manual  (standard)  methods  when  soldiers  with  limited 
experience  and  the  same  MOS  (Experiment  II)  were  evaluated.  Soldiers  in  this 
group  identified  ordnance  significantly  faster  using  AIRES. 
Table 8

Experiment II
November 1988

Mean Completion Time by Group, Subject, and Method
(minutes)

Group   Subject    AIRES    STANDARD

  1        1        5.67      13.67
           2        3.67       5.00
           3        9.50      12.67
           4        4.67       7.33
           5        7.00      24.00
           6         --         --

  2        1        7.67      11.67
           2        5.33      12.00
           3        6.67      13.33
           4        5.33       6.33
           5        4.67      13.67
           6        9.00      10.00


CONCLUSIONS 

This paper has demonstrated that uncontrolled sources of variability
during experimentation, attributable to differences in experience, age, MOS,
and Armed Forces, can mask significant results and hinder decision makers from
drawing the correct conclusions concerning the explosive ordnance disposal
automated information retrieval and expert system. Model-based diagnostic
procedures have been demonstrated to be effective in assessing the data and
experimental model and in indicating probable causes for the violation of the
model assumptions. Through these diagnostic procedures, the researcher was
able to control additional sources of variability so that the data conformed
to the standard assumption of compound symmetry. Thus, the researcher was
able to conclude that there was a significant identification time difference
between the automatic (AIRES) and manual (standard) methods when soldiers with
limited experience and the same MOS were evaluated. Soldiers in this group
identified ordnance significantly faster using AIRES.
MODELING GUNFIRE:

A  PROBLEM  IN  MAXIMUM  LIKELIHOOD  ESTIMATION 

Henry  B.  Tingey 


Statement of the Problem.

Figure 1. Schematic

A gun at O fires at a wall d units away. The angle of fire, φ, that the gun
makes with OM is a random variable having the distribution function given by

    F(\phi) =
    \begin{cases}
      0                                              & \phi < -\pi/2 \\
      \left( \dfrac{\phi + \pi/2}{\pi} \right)^{\alpha}   & -\pi/2 \le \phi \le \pi/2 \\
      1                                              & \phi > \pi/2
    \end{cases}                                                        (1)

Suppose we can observe only the hit made on the wall. Let x be the distance
measured from M to a hit. From a random sample x_1, x_2, ..., x_n of such hits:
i) Determine the maximum likelihood estimator of α, α̂. ii) Find the
distribution of α̂. iii) When α = 1, what is the distribution of x, the
distance from M to a hit? iv) Set up a uniformly most powerful test for the
hypothesis H: α = α_0 with alternatives α < α_0 for an arbitrary significance
level.

a) Given the sample of hits, -1.306 d, 4.921 d, 2.865 d, -6.512 d, -3.984 d,
-0.782 d, 3.542 d, 5.552 d, -4.375 d, -2.107 d, test H_0: α = 1 versus
H_1: α < 1 at the .05 level of significance.

b) Plot the power function of the test H_0: α = 1 against the alternative
α < 1 at the .05 significance level using the points α = 7/4, (1/4), 1/4.

Determination  of  the  Maximum  Likelihood  Estimator. 

The principle of maximum likelihood estimation is: if
f_{x_1,...,x_n}(X_1, ..., X_n; θ) is the density for a random sample of size n
drawn from a population with an unknown parameter θ, then the maximum
likelihood estimate of θ is the value θ̂, if it exists, such that
f_{x_1,...,x_n}(X_1, ..., X_n; θ̂) > f_{x_1,...,x_n}(X_1, ..., X_n; θ'), where
θ' is any other value of θ, and Π f(x_i; θ) = L(θ | x_1, ..., x_n) = L(θ).

The function Π f(x_i, θ), the likelihood function, is a function of θ when
the x_i are known. When this is true the function is regarded as a likelihood
function of θ, and the maximum likelihood estimate of θ is therefore that
value of θ which maximizes Π f(x_i, θ) as a function of θ.

The procedure is as follows: We are given the distribution function (cdf),
of which the first derivative with respect to the random variable is the
density.

Recall (1) and, for simplicity, let

    z = \frac{\phi + \pi/2}{\pi}.

That is,

    G(z) =
    \begin{cases}
      0            & z < 0 \\
      z^{\alpha}   & 0 \le z \le 1 \\
      1            & z > 1
    \end{cases}                                   (2)

    G'(z) = g(z, \alpha) = \alpha z^{\alpha - 1}        (3)

where g(z, α) is defined over 0 < z < 1 and is zero otherwise.
Consider now the likelihood function,

    l(\alpha) = \prod_{i=1}^{n} g(z_i, \alpha).

Following the customary procedure of finding the maximum of the logarithm
of the likelihood function rather than of the likelihood function itself,
maximize

    L(\alpha) = \ln l(\alpha) = \sum_{i=1}^{n} \ln g(z_i, \alpha)
              = n \ln \alpha + (\alpha - 1) \sum_{i=1}^{n} \ln z_i.

Then

    L'(\alpha) = \frac{n}{\alpha} + \sum_{i=1}^{n} \ln z_i

and setting L'(α) = 0, we obtain

    \hat{\alpha} = \frac{-n}{\sum_{i=1}^{n} \log z_i}.
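As a numerical illustration, the estimator α̂ = -n / Σ log z_i can be evaluated for the ten hits listed in the problem statement. The sketch below is not part of the original paper: it assumes the hits are expressed in units of d, so that φ = arctan(x/d) and z = (φ + π/2)/π, and the function names are hypothetical.

```python
import math

# Hits from the problem statement, expressed in units of d (x / d).
hits = [-1.306, 4.921, 2.865, -6.512, -3.984,
        -0.782, 3.542, 5.552, -4.375, -2.107]


def to_unit_interval(x_over_d):
    """Map a hit position to z = (phi + pi/2) / pi, with phi = arctan(x/d)."""
    phi = math.atan(x_over_d)
    return (phi + math.pi / 2) / math.pi


def alpha_mle(z_values):
    """Maximum likelihood estimate: alpha_hat = -n / sum(log z_i)."""
    return -len(z_values) / sum(math.log(z) for z in z_values)


z_values = [to_unit_interval(x) for x in hits]
alpha_hat = alpha_mle(z_values)
print(alpha_hat)
```

Because every z lies strictly between 0 and 1, each log z_i is negative and the estimate is always positive.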
The Distribution of α̂.

The distribution of α̂ may be determined exactly by two different approaches
and approximately in at least one way. The two methods of finding the exact
distribution are i) a change of variable in the density function and ii) the
moment generating function. The approximate method consists of considering
the normal approximation.

i) Recall (3), and let y = -log z. Then

    h(y)\, dy = \alpha e^{-\alpha y}\, dy, \qquad 0 < y < \infty.        (4)

It is easily verified that the integral of (4) is 1. Consider the
transformation αy = u/2. Substitution into (4) yields

    q(u)\, du = \tfrac{1}{2} e^{-u/2}\, du

and

    Q(U) = \int_{0}^{U} \tfrac{1}{2} e^{-u/2}\, du        (5)

describes the χ² distribution with two degrees of freedom.

Hence -2α log z is distributed as χ² with two degrees of freedom. Therefore
2nα/α̂ = -2α Σ log z_i is distributed as χ² with 2n degrees of freedom, since
it is the sum of n independent χ² variables.¹


ii) Recall:

    \hat{\alpha} = \frac{-n}{\sum \ln z_i}, \qquad 0 < z < 1.

Let

    y = -\frac{\sum \ln z_i}{n}, \qquad \hat{\alpha} = 1/y.

The moment generating function is defined by m_Y(t) = E[e^{tY}] with t as a
parameter, t < αn. From the sample z_1, ..., z_n of n independently and
identically distributed variates we may make the following calculations. By
substitution,

    E[e^{tY}] = \{ E[e^{-(t/n) \log z}] \}^{n},

since the distribution of any one z describes the distribution of all the
others. Hence the result here

    = \{ E[z^{-t/n}] \}^{n}.

But

    \{ E[z^{-t/n}] \}^{n}
      = \left[ \int_{0}^{1} z^{-t/n}\, \alpha z^{\alpha - 1}\, dz \right]^{n}
      = \left[ \frac{\alpha}{\alpha - (t/n)} \right]^{n}.

Therefore

    m_Y(t) = [1 - (t/\alpha n)]^{-n}.        (6)

This is the moment generating function for a gamma distribution whose density
is proportional to y^{n-1} e^{-nαy}.

1 The exact distribution of α̂ is considered in Appendix I.


By  virtue  of  the  uniqueness  theorem:  If  two  random  variables  have  the  same 
generating  function,  they  have  the  same  density  function,  thus,  an  attempt 
must  be  made  to  identify  this  density  function  with  a  known  density.  Otherwise, 
we  go  on  to  the  next  problem. 


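Consider

    k(w) = \frac{(n\alpha)^{n}}{(n-1)!}\, w^{n-1} e^{-w n \alpha}, \qquad w > 0.

Let w = 1/α̂; then

    \ell(\hat{\alpha}) = \frac{(n\alpha)^{n}}{(n-1)!}\, \hat{\alpha}^{-(n+1)} e^{-n\alpha/\hat{\alpha}}, \qquad \hat{\alpha} > 0.

Let u = 2nα/α̂; then

    j(u) = \frac{1}{2^{n} (n-1)!}\, u^{n-1} e^{-u/2}, \qquad u > 0.

This is a χ² distribution with 2n degrees of freedom (χ² with two degrees of
freedom when n = 1).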


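iii) The approximate normal distribution of the estimator, α̂, for large
samples is given by:

    \Phi(\hat{\alpha}) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(\hat{\alpha} - \alpha)^{2} / 2\sigma^{2}}

where

    \sigma^{2} = \frac{-1}{n\, E\left[ \partial^{2} \log g(z, \alpha) / \partial \alpha^{2} \right]},

    \log g(z, \alpha) = \log \alpha + (\alpha - 1) \log z,

    \frac{\partial^{2} \log g(z, \alpha)}{\partial \alpha^{2}} = -\frac{1}{\alpha^{2}}.

Then

    \sigma^{2} = \frac{-1}{n \left( -1/\alpha^{2} \right)} = \frac{\alpha^{2}}{n}.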
The Case When α = 1.

Returning to the original density function,

    f(\phi) = \frac{\alpha}{\pi} \left( \frac{\phi + \pi/2}{\pi} \right)^{\alpha - 1}, \qquad -\pi/2 < \phi < \pi/2.

If α = 1 it is then readily apparent that f(φ) = k with k = 1/π, so φ has the
uniform density.

The distribution of x will be given by the expression

    g(x) = f[h(x)]\, |h'(x)|,

where φ = h(x). From the physical situation, let x = d tan φ. With x measured
in units of d, h(x) = arctan x and

    h'(x) = \frac{1}{1 + x^{2}}.

Hence the distribution of x is given by

    g(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^{2}}, \qquad -\infty < x < \infty        (7)

or the familiar normalized Cauchy distribution.²
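The Cauchy result for α = 1 can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original paper: it draws φ uniformly on (-π/2, π/2), sets x = d tan φ with d = 1, and compares the empirical distribution of the hits with the Cauchy distribution function 1/2 + arctan(x/d)/π. The function names, the seed, and the sample size are arbitrary choices.

```python
import math
import random


def simulate_hits(n, d=1.0, seed=0):
    """Simulate n hits for alpha = 1, i.e. phi uniform on (-pi/2, pi/2)."""
    rng = random.Random(seed)
    return [d * math.tan(rng.uniform(-math.pi / 2, math.pi / 2))
            for _ in range(n)]


def cauchy_cdf(x, d=1.0):
    """Distribution function of the Cauchy density d / (pi * (d^2 + x^2))."""
    return 0.5 + math.atan(x / d) / math.pi


simulated = simulate_hits(20000)
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    empirical = sum(hit <= x for hit in simulated) / len(simulated)
    print(x, round(empirical, 3), round(cauchy_cdf(x), 3))
```

The heavy tails are visible in the simulated hits: a few of them land many multiples of d away from M.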

The  Uniformly  Most  Powerful  Test. 

First, the definition of a uniformly most powerful test. A uniformly most
powerful test is defined as a test in which all alternatives give rise to the
same critical region. That is:

    f(x : \phi) > k f(x : \phi_0)

for any particular value of φ and all possible φ_0. This situation is, in
general, not true. However, for the common distributions in statistics it is.³

A natural procedure, in the context of this paper, is then to try the
likelihood ratio test. For the situation here let z_1, ..., z_n be a sample of
size n from a population with density g(z, α). We wish to test the null
hypothesis H_0: g(z, α) belongs to ω, a subspace of Ω. The likelihood of the
sample is:

    L = \prod_{i=1}^{n} g(z_i, \alpha).


2  Referred  to  in  Box  Hunter,  Hunter  Statistics  for  Researchers  as  a  Mathematicians  Toy,  p. 
69. 

3  Further  discussion  of  the  uniformly  most  powerful  test  is  found  in  Appendix  II. 
Ordinarily the likelihood as a function of the parameters will have a maximum
as the parameters are allowed to vary over the whole parameter space, Ω. The
maximum in ω will occur in a like manner except that here the parameters vary
over a subspace of Ω. L(α̂), the maximum value in the whole space, will be
called L(Ω̂). L(α_0), the maximum value in the subspace, will be called L(ω̂).
The ratio of these two maxima thus forms the likelihood ratio, which is
denoted by λ.

This fraction has the following obvious properties: it is positive, since L
is a product of positive density functions, and L(ω̂) is always less than or
equal to L(Ω̂) because of fewer degrees of freedom in ω. Also, here it is a
ratio of similarly shaped monotonic functions that differ, at most, in
location. Thus, the λ ratio is a monotonic function of sample observations
only and has range zero to one.

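Suppose for the distribution here we wish to test the null hypothesis

    H_0 : \alpha = \alpha_0.

This point is then ω, while the whole α axis is Ω. The likelihood is

    L = \prod_{i=1}^{n} \alpha z_i^{\alpha - 1}.

The maximum value of the likelihood in Ω is given by setting α = α̂ to obtain

    L(\hat{\Omega}) = \prod_{i=1}^{n} \hat{\alpha} z_i^{\hat{\alpha} - 1}

and in ω

    L(\hat{\omega}) = \prod_{i=1}^{n} \alpha_0 z_i^{\alpha_0 - 1}.

The likelihood ratio is

    \lambda = \prod_{i=1}^{n} \frac{\alpha_0}{\hat{\alpha}}\, z_i^{\alpha_0 - \hat{\alpha}}.

Taking the log of both sides we obtain

    \frac{\log \lambda}{n} = \log \frac{\alpha_0}{\hat{\alpha}} - \frac{\alpha_0}{\hat{\alpha}} + 1,
    \qquad \text{where} \qquad \frac{\sum \log z_i}{n} = -\frac{1}{\hat{\alpha}}.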

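Consider the limiting cases for α_0 and α̂:⁴

    For \alpha_0 \to 0:          \lim_{\alpha_0 \to 0} \left[ \log \frac{\alpha_0}{\hat{\alpha}} - \frac{\alpha_0}{\hat{\alpha}} + 1 \right] = -\infty.

    For \alpha_0 = \hat{\alpha}:  \log \frac{\alpha_0}{\hat{\alpha}} - \frac{\alpha_0}{\hat{\alpha}} + 1 = 0.

    For \hat{\alpha} \to 0:       \lim_{\hat{\alpha} \to 0} \left[ \log \frac{\alpha_0}{\hat{\alpha}} - \frac{\alpha_0}{\hat{\alpha}} + 1 \right] = -\infty.⁵

4 Here n will be fixed and have no real effect at the limit.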

Also  it  is  readily  apparent  that  log  A  is  monotone  since  all  the  values  involved 
in  the  function  are  always  positive  or  zero  and  monotone,  as  the  ratio  of  two  similar 
monotonic  functions. 

Using the $\Lambda$ criterion as a test would yield a very clumsy procedure. An actual
sampling distribution for $\Lambda$ would indeed be difficult to find. We then have two
resources:

i)  The large sample distribution of $\Lambda$, which states that under certain conditions
$-2 \log \Lambda$ is distributed approximately as $\chi^2$ with degrees of freedom equal to
the dimensionality of $\Omega$, say k, less the dimensionality of $\omega$, say r, if r < k
and $H_0$ is true.

ii)  We may recall the result of the distribution of the reciprocal of the estimator,
$\hat{\alpha}$, as determined via the moment generating function method.

Here we now see that the quantity $2n\alpha_0/\hat{\alpha}$, which is distributed as $\chi^2$ with 2n
degrees of freedom, is everywhere greater than the $\Lambda$ criterion and has the desirable
properties of the $\chi^2$ distribution. We need only specify an 'n' and a uniformly most
powerful test will be determined. Further, $2n\alpha_0/\hat{\alpha}$ cannot have fewer than 2
degrees of freedom while the large sample $\Lambda$ has only 1, no matter what the sample
size. The nature of the chi square distribution has been shown to give uniformly
more power as the degrees of freedom increase.

Because the large sample approximation still has the inherent awkward
calculation involved, we use the test criterion

$T = 2n\alpha_0/\hat{\alpha} \sim \chi^2_{2n}$ under $H_0$.  (8)


a)  For the sample given earlier the hypothesis is tested by

Test procedure:

i)  $H_0 : \alpha = 1$

ii)  $H_A : \alpha < 1$

iii)  Significance level = .05

iv)  $T = 2n\alpha_0/\hat{\alpha} \sim \chi^2$ with 2n degrees of freedom under $H_0$

v)  Here n = 10 so we have 20 degrees of freedom

vi)  Test Rule: Reject $H_0$ if $T > \chi^2_{20}(.95) = 31.4$

vii)  Calculation of T

5  This result can be easily verified from elementary calculus.

Recall

$z = \phi/\pi + 1/2$

and  $x/d = \tan \phi$,  $\phi = \arctan(x/d)$.

For our sample

x          $\phi$ (nearest minute)6   Radians     $(\phi/\pi + 1/2)$   $\log(\phi/\pi + 1/2)$
-1.306d    -52°34'                 -0.91746    .208             -1.57022
 4.921d     78°31'                  1.37037    .936             -0.06614
 2.865d     70°46'                  1.23512    .893             -0.11317
-6.512d    -81°16'                 -1.41838    .049             -3.01593
-3.984d    -75°55'                 -1.32499    .078             -2.55105
-0.782d    -38°02'                 -0.66381    .289             -1.24133
 3.542d     74°14'                  1.29561    .912             -0.09212
 5.552d     79°48'                  1.39218    .943             -0.05869
-4.375d    -77°08'                 -1.34673    .071             -2.64508
-2.071d    -64°13'                 -1.12079    .143             -1.94491

Then  $\sum_{i=1}^{10} \log(\phi_i/\pi + 1/2) = -13.29864$  and

$\hat{\alpha} = 10/13.29864 = .75196$.

So for T we obtain

$T = 2(10)(1)/.75196 = 26.60$.

Since $T = 26.6 < \chi^2_{20}(.95) = 31.4$ we do not reject $H_0$.
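The arithmetic of this test can be retraced with a short sketch. It assumes the ten transformed observations from the table, taking the seventh as .912, the value consistent with its listed logarithm of -0.09212. The variable names are illustrative only and do not come from the paper.

```python
import math

# Transformed observations z = phi/pi + 1/2 from the sample table
# (seventh value taken as .912, matching its tabulated logarithm).
z = [0.208, 0.936, 0.893, 0.049, 0.078, 0.289, 0.912, 0.943, 0.071, 0.143]

n = len(z)
alpha0 = 1.0          # hypothesized value under H0
critical = 31.4       # chi-square(20) critical value quoted in the text

sum_log = sum(math.log(v) for v in z)   # sum of log z_i (negative)
alpha_hat = -n / sum_log                # maximum likelihood estimate
T = 2 * n * alpha0 / alpha_hat          # test statistic, chi-square(2n) under H0
reject = T > critical

print(f"sum log z = {sum_log:.5f}")
print(f"alpha_hat = {alpha_hat:.5f}")
print(f"T = {T:.2f}, reject H0: {reject}")
```

The computed statistic falls below the critical value, so the sketch reaches the same decision as the worked example.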

The Power Function.

For the case here, when we are testing the hypothesis $H_0 : \alpha = \alpha_0$ versus all
alternatives $H_A : \alpha < \alpha_0$, the power function will be given by

$\int_{(\alpha/\alpha_0)\chi^2_{2n}(.95)}^{\infty} f(x)\,dx$,

where f(x) is the appropriate $\chi^2$ density. This will be a function of $\alpha$. When the
true $\alpha$ is far to the left of $\alpha_0$ the power will be near 1. As $\alpha$ approaches $\alpha_0$ the
power approaches the significance level. As $\alpha$ goes to the right the power decreases
to zero for this alternative.

For purposes of graphing the power function, consider the following argument.

Power $= \Pr\{T > \chi^2_{20}(.95)\}$

where T is our test statistic and $\chi^2_{20}(.95)$ is the critical value of the $\chi^2$
distribution, or

$\Pr\{2n\alpha_0/\hat{\alpha} > \chi^2_{20}(.95)\} = 1 - \beta$.

Call the true value of $\alpha$, $\alpha_1$, and multiply both sides of the inequality by
$\alpha_1/\alpha_0$, obtaining:

$\Pr\{2n\alpha_1/\hat{\alpha} > (\alpha_1/\alpha_0)\chi^2_{20}(.95)\} = 1 - \beta$.

Then, since $2n\alpha_1/\hat{\alpha}$ is an exact $\chi^2_{20}$ variate we have

$\Pr\{\chi^2_{20} > (\alpha_1/\alpha_0)\chi^2_{20}(.95)\}$ = Power.

$(\alpha_1/\alpha_0)\chi^2_{20}(.95)$     $\Pr\{\chi^2_{20} > (\alpha_1/\alpha_0)\chi^2_{20}(.95)\}$
54.95                     0.0
47.10                     .0005
39.25                     .0033
31.40                     .05
23.55                     .27
15.70                     .74
 7.85                     .9925

[Figure: Power Function]
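The power values can be recomputed with the following sketch. It relies on the closed-form upper tail of a chi-square distribution with an even number of degrees of freedom, $\Pr\{\chi^2_{2n} > x\} = e^{-x/2}\sum_{k=0}^{n-1}(x/2)^k/k!$, so no statistical library is needed. The function names and the grid of ratios $\alpha_1/\alpha_0$ are illustrative assumptions. The critical value 31.4 is the one quoted in the text.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability Pr{chi-square(df) > x}, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0              # k = 0 term of the finite sum
    for k in range(1, df // 2):
        term *= half / k                # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

def power(ratio, n=10, critical=31.4):
    """Power when the true alpha equals ratio * alpha0 (ratio = alpha1/alpha0)."""
    return chi2_sf_even(ratio * critical, 2 * n)

for ratio in (1.75, 1.50, 1.25, 1.00, 0.75, 0.50, 0.25):
    print(f"{ratio:4.2f}  {ratio * 31.4:6.2f}  {power(ratio):.4f}")
```

At a ratio of 1 the sketch returns the significance level, and the power rises toward 1 as the true value moves below $\alpha_0$. Small differences from the tabulated entries may reflect rounding or table interpolation in the original.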
This is the uniformly most powerful test for a given alternative. Also it is more
powerful than the $\Lambda$ criterion, as will be shown later. As n increases the test
becomes more sensitive. As n decreases so does the power of the test.

As a final note to this paper it may be of interest to repeat the above test
procedure for the large sample distribution of the $\Lambda$ criterion.

Under this procedure $-2 \log \Lambda$ is distributed approximately as $\chi^2_1$ under $H_0$.

Reject here if $-2 \log \Lambda > \chi^2_1(.95) = 3.84$.

Calculation:

$-2 \log \Lambda = -2n[\log(\alpha_0/\hat{\alpha}) - \alpha_0/\hat{\alpha} + 1] = -20[\log(1/.75196) - 1/.75196 + 1]$

$= -20(.28518 - 1.33 + 1)$

$= .8964$

Therefore do not reject $H_0$ since .8964 < 3.84.

A look at the power of the test in comparison to the above power function may
take the following argument.

In order to satisfy the condition of a uniformly most powerful test for the test
criterion $2n\alpha_0/\hat{\alpha}$ consider the following:

We need to show for

$\alpha_0/\hat{\alpha} > A$

that $T_{\chi^2_{2n}}$, as the test statistic for the exact sampling distribution of $1/\hat{\alpha}$,
and $T_\Lambda$, as the statistic for the approximate distribution of $-2 \log \Lambda$, have the
relationship $T_\Lambda > T_{\chi^2}$, where

for $T_{\chi^2}$ :  $\alpha_0/\hat{\alpha} > A$

and $T_\Lambda$ :  $\alpha_0/\hat{\alpha} > A'$.

It is sufficient to show that $A' > A$ to conclude that $T_{\chi^2_{2n}}$ is a uniformly more
powerful criterion than $T_\Lambda$. It will be necessary to establish the inequality $A' > A$
for every A and A'.

Consider first the critical values and equate the respective test statistics to
them:

$T_{\chi^2} = 2n\alpha_0/\hat{\alpha} = 31.4$;
here $\hat{\alpha} < .6394$ is sufficient for rejection and $\alpha_0/\hat{\alpha} = 1.5700$.

$T_\Lambda = -2n[\log(\alpha_0/\hat{\alpha}) - \alpha_0/\hat{\alpha} + 1] = 3.84$;
here $\hat{\alpha} < .5700$ for rejection and $\alpha_0/\hat{\alpha} = 1.7547$.

Then $A' > A$, and clearly, since the chi square distribution is monotonic and also
has a greater rate of change as the degrees of freedom are increased, the inequality
will hold for every A' and A. It follows that $T_{\chi^2}$ is the uniformly most
powerful test criterion.

The solution here is iterative. Usually 4 or 5 tries give accuracy to the 3rd decimal place.
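The iterative solution mentioned above can be sketched as a bisection search. This is only one possible scheme, since the paper does not say which iteration was used. The function names, bracketing interval, and tolerance are assumptions made for illustration.

```python
import math

def lr_statistic(ratio, n):
    """-2 log(Lambda) written as a function of ratio = alpha0 / alpha_hat."""
    return -2.0 * n * (math.log(ratio) - ratio + 1.0)

def solve_ratio(target, n, lo=1.0, hi=10.0, tol=1e-8):
    """Bisection for the ratio > 1 at which lr_statistic equals target.

    lr_statistic is zero at ratio = 1 and increases for ratio > 1,
    so the root is bracketed by [lo, hi] for moderate targets.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lr_statistic(mid, n) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n = 10
ratio_lr = solve_ratio(3.84, n)      # A' : threshold from the large-sample test
ratio_exact = 31.4 / (2 * n)         # A  : threshold from the exact chi-square test

print(f"A' = {ratio_lr:.4f}, alpha_hat below {1 / ratio_lr:.4f} rejects")
print(f"A  = {ratio_exact:.4f}, alpha_hat below {1 / ratio_exact:.4f} rejects")
```

The root agrees with the quoted value 1.7547 to about two decimal places, and it exceeds 1.57, which is the relationship $A' > A$ used in the argument.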
Appendix I


The Exact Distribution of $\hat{\alpha}$.


After performing the substitution in Part II, it was found that:

$\phi(y)\,dy = \alpha e^{-\alpha y}\,dy$

which defines the distribution of the reciprocal of the estimator, $\hat{\alpha}$.

The distribution of $\hat{\alpha}$ then is actually the inverse of a chi square distribution
with two degrees of freedom.

It is possible to find the form of the distribution of $\hat{\alpha}$ then by substitution into
the above formula.

Let $x = 1/y$;
then $dx = -(1/y^2)\,dy$.

Thus $y = h(x) = 1/x$ is the inverse of the above transformation and it follows that
$|h'(x)|\,dx = x^{-2}\,dx$. From the relationship

$g(x) = f[h(x)]\,|h'(x)|$

we obtain

$\alpha x^{-2} e^{-\alpha/x}\,dx$.

If we let

$x = \hat{\alpha}$,
$dx = d\hat{\alpha}$,

then

$\frac{\alpha}{\hat{\alpha}^2} e^{-\alpha/\hat{\alpha}}\,d\hat{\alpha}$

describes the distribution of the maximum likelihood estimator, $\hat{\alpha}$.

Here the distribution of $\hat{\alpha}$ is not readily recognizable. It may be possible to
identify this distribution via Pearson's Principle of Moments. However, if this were
tried, one is still faced with getting to a convenient form of the distribution. Since
it has already been shown that the distribution of $1/\hat{\alpha}$ is readily recognizable and
has the desirable properties of a "good" test, it seems futile to pursue this approach
any further.
Appendix II

Mathematical Proof of a Uniformly Most Powerful Test.

For a particular value $\alpha'$ of $\alpha$ the best test of $\alpha_0$ against $\alpha'$ is given by
choosing as a critical region the set of points for which

$f(z; \alpha') > k f(z; \alpha_0)$

or for this particular distribution

$\alpha' z^{\alpha' - 1} > k \alpha_0 z^{\alpha_0 - 1}$

$z^{\alpha' - \alpha_0} > k \frac{\alpha_0}{\alpha'}$

$\log z > \frac{\log k + \log(\alpha_0/\alpha')}{\alpha' - \alpha_0}$.

Hence we have as the best critical region an interval $\log z > A$. Since it is
known that log z is distributed as chi square with two degrees of freedom, A is to
be chosen so that

$\int_A^{\infty} f(\chi^2)\,d\chi^2 = P\{\text{Type I error}\}$.

The value A is the abscissa for the $\chi^2_2$ distribution such that if f(x) is the $\chi^2_2$
distribution then

$\int_A^{\infty} f(x)\,dx = P\{\text{Type I error}\}$.

For example: for

P{Type I error} = .05,
A = 5.99.

This derivation is on the basis of a single observation. The generalization to
samples of size n is immediate. The observation $(z_1, \ldots, z_n)$ may be plotted in
n-dimensional space, which may be divided into two regions, rejection and acceptance.
The ratio of the products of the two density functions will yield a critical region
which will be defined by a chi square value with 2n degrees of freedom at the level
of the Type I error.
RESPONSE SURFACE METHODOLOGY FOR VALUE ADDED ANALYSIS

Captain William F. Mann III and LTC Andrew G. Loerch
Force Systems Directorate - Resources Analysis
U.S. Army Concepts Analysis Agency
8120 Woodmont Avenue
Bethesda, Maryland 20814-2797

ABSTRACT. The Value Added Analysis (VAA) study at the U.S. Army
Concepts Analysis Agency will develop the capability to perform cost-benefit
analysis of major item systems in support of the Army Planning,
Programming, Budgeting, and Execution System (PPBES). A sub-problem
of the study is the analysis of the contribution of the systems under
consideration to different measures of effectiveness in a combat model.
Modified Plackett-Burman designs are used to create general linear
formulas of the form:

$Y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \ldots + b_r x_r$.

While  linear  models  are  usually  used  for  fitting  a  regression  line  along  a 
range  of  variables,  they  can  also  be  used  with  qualitative  variables.  The 
estimator $b_0$ is a collective measure of the worth of the base case weapons.
The estimators $b_1$ through $b_r$ represent the contribution of individual
systems  to  the  improvement  of  the  Corps-level  measures  of  effectiveness 
(MOEs);  e.g.,  Loss  Exchange  Ratio,  Fractional  Exchange  Ratio,  Effective 
Battalions  Remaining,  etc. 

There  are  several  advantages  of  this  method  over  previous  practices. 

First  a  weapon's  "additive"  effect  can  be  estimated.  This  "additive"  effect 
can  be  used  to  determine  the  "benefit"  of  buying  this  system.  Secondly,  the 
general  linear  model  can  be  used  to  evaluate  different  force  packages  or 
mixes  without  rerunning  the  combat  model. 


1.  INTRODUCTION.  The  Value  Added  Analysis  (VAA)  methodology 
was  developed  by  the  U.S.  Army  Concepts  Analysis  Agency  to  provide  the 
Army  Staff  with  a  rapid  response  analysis  framework  for  performing  cost- 
benefit  analysis  to  compare  competing  investment  alternatives  during 
development  of  the  Army  Program.  An  important  aspect  of  this 
methodology  is  the  evaluation  of  major  item  systems  performance  using  a 
combat  simulation. 

In  the  present  environment  of  shrinking  budgets  and  no  dominant 
theater,  program  development  has  become  increasingly  difficult.  Many 
competing  systems  must  be  evaluated  in  several  theaters  across  several 
years. The number of simulation runs is potentially huge. The task, then,
is to keep the number of runs at a minimum while extracting as much
information as possible from each one.

The  purpose  of  this  paper  is  to  describe  how  the  technique  of  response 
surface  methodology  (RSM)  and  the  general  linear  model  (GLM)  can  be 
used  to  accomplish  this  task.  The  use  of  these  two  statistical  techniques 
allows  the  systematic  development  of  the  set  of  required  simulations  and  its 
inputs,  and  facilitates  the  analysis  of  the  output  as  well. 


2.  COMBAT  MODELING.  The  Corps  Battle  Analyzer  (CORBAN)  is  a 
complex  combat  simulation  in  which  combat  functions  are  represented 
mathematically.  VAA  requires  the  evaluation  and  quantification  of  certain 
combat  and  non-combat  systems.  It  is  also  necessary  to  represent  the 
systems in several different theaters and across a wide range of years. To
get  a  broad  spectrum  of  possible  situations,  the  systems  are  evaluated  in 
three  different  and  plausible  scenarios  for  three  different  years  (1996,  2001, 
and  2008).  For  each  of  these  nine  scenario/year  combinations,  VAA 
requires  approximately  50  CORBAN  runs.  We  are  interested  in  several 
measures  of  effectiveness  (MOEs)  derived  from  the  model.  Some  examples 
of  these  MOEs  are: 

a. Loss Exchange Ratio (LER) - This MOE is the ratio of the number of
major Red systems lost to the number of Blue systems lost. LER is
computed as follows:

$\mathrm{LER} = \frac{\text{Red Systems Lost}}{\text{Blue Systems Lost}}$

b. Fractional Exchange Ratio (FER) - This MOE compares the
fractional Red losses to the fractional Blue losses. FER is computed
as follows:

$\mathrm{FER} = \frac{\text{Red Systems Losses} / \text{Red Systems Started}}{\text{Blue Systems Losses} / \text{Blue Systems Started}}$

c.  Red  and  Blue  Effective  Battalions  Remaining  (EBR)  -  This  MOE  is  a 
measure  of  the  number  of  battalions  (generally  maneuver,  artillery,  rocket 
and  helicopter)  remaining  on  each  side  that  are  still  combat  effective  at  the 
conclusion  of  the  simulated  conflict. 

d.  Red  and  Blue  Movement  of  Force  Center  of  Mass  (MFCM)  -  This 
MOE measures the performance of an attacker by examining the
distance  the  center  of  mass  of  his  forces  has  travelled. 

The  inputs  to  CORBAN  include  scenario,  terrain,  representative  Red 
forces and Blue forces, missions, and orders. The Base Case asset list
contains current systems that will be available during a particular year of
interest.  Excursions  are  developed  by  substituting  base  case  systems  with 
new  weapon  systems,  or  adding  new  systems,  to  produce  new  results  that 
can  be  compared  with  the  base  case  results. 

3.  THE  STATISTICAL  MODEL.  If  one  imagines  that  the  CORBAN 
model  is  a  black  box  (see  Figure  1),  the  use  of  response  surface  methodology 
becomes  easier  to  understand.  "Response  surface  methodology  comprises  a 
group  of  statistical  techniques  for  empirical  model  building  and  model 
exploitation.  By  careful  design  and  analysis  of  experiments,  it  seeks  to 
relate  a  'response',  or  output  variable  to  the  levels  of  a  number  of 
'predictors',  or  input  variables,  that  affect  it"  (Box  and  Draper,  1987;  1). 


[Figure 1. CORBAN with All the Inputs. Inputs: Blue Systems, Red Systems,
Terrain, Scenario. Model: CORBAN. Outputs: ..., Y2, Y3.]

The  most  common  and  simplest  method  of  using  a  combat  model  is  to 
establish  a  base  line  case,  and  then  to  add  each  new  weapon  system  one  at  a 
time,  measuring  the  changes  in  combat  effectiveness.  These  changes  from 
the  base  line  case  measure  the  amount  a  weapon  system  contributes  to  the 
outcome  of  the  battle.  While  this  method  measures  the  contribution  of  each 
individual  weapon,  it  does  not  allow  the  determination  of  the  additive  effect 
of  weapon  systems,  i.e.,  if  an  attack  helicopter  raises  the  value  of  an  MOE 
by  "x"  and  a  tank  raises  the  value  by  "y",  then  it  is  not  true  that  if  both 
systems  are  present  the  resulting  improvement  would  be  "x+y". 

The ideal solution would be to explore all possible combinations and find
the combination of systems that yields the greatest increase in the MOE
values. While this method is practical for small problems, the number of
combinations grows quickly. If one had to explore every combination of 40
different systems, the number of potential runs would be $2^{40}$, or about 1.1
trillion runs.

RSM represents a compromise between the process of replacing weapons
one at a time and the ideal solution of doing every combination. This
compromise is a fractional design. This technique is based on taking
specific combinations of the total combinations possible. The combinations'
results are then averaged to find an estimate for a system's contribution to
the MOE.

RSM  does  allow  the  measurement  of  the  additive  effects  of  weapons. 

This  methodology  fosters  the  construction  of  a  "design  matrix"  that  varies 
the  inputs  in  an  efficient  manner  so  that  a  linear  model  can  be  built  to 
forecast  the  effects  of  the  systems  with  respect  to  the  outputs.  A  set  of 
coefficients  is  computed  which  are  the  mean  or  average  improvement  given 
the  new  weapon  system.  These  coefficients  can  be  used  in  an  additive 
estimate. 

In  the  following  example,  we  hold  the  Red  weapon  systems,  the 
terrain, and the scenario input variables constant and vary the Blue
systems  that  we  want  to  study.  This  technique  reduces  the  inputs  to  the 
black  box  to  a  more  manageable  size  as  illustrated  in  Figure  2. 


[Figure 2. Controlling only the Desired Input. Inputs (x) feed the CORBAN
model, which produces the Outputs.]

A general linear model is a method of estimating an output or dependent
variable, Y, whose mean is a function of one or more independent variables
($x_1$, $x_2$, etc.). The general linear model is of the form:

$Y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \ldots + b_r x_r$  (1)

or in matrix form,

$Y = XB$,

where Y is an output MOE vector of the model run results, $b_0$ is the effect of
the base case weapons, $b_1$ through $b_r$ are the effects of systems 1 through r
in the excursions, and X is the design matrix of binary independent
variables whose construction is described below. This model isolates only
the presence of main effects without considering any interaction effects.

Plackett-Burman designs are useful when the problem of determining
the main effects with maximum precision is reduced to a combinatorial
problem. They are useful when the problem has only two-level factors, i.e.,
when there are low and high variable settings or binary (0, 1) variables. If
there is a need to consider a mixture of two- and three-level factors,
orthogonal array designs may be employed.

Plackett-Burman's method specifies the construction of the design
matrix, X. This matrix represents a map of all the independent variables'
values for each computer run. Each row corresponds to a specific
computer run and each column corresponds to a different factor. In the
case of VAA, the systems being considered for procurement are the factors.
The values of the matrix elements are either 1 or 0, which represent the
presence or absence of the system, respectively.

To  illustrate  this  coding  scheme,  we  consider  the  following  two  cases. 

The  first  case  involves  a  new  system  replacing  an  existing  system.  An 
example  of  this  case  would  be  the  Squad  Automatic  Weapon  (SAW) 
replacing  the  M60  machine  gun.  In  excursions  where  soldiers  are 
equipped  with  a  SAW,  a  1  would  be  entered  in  the  design  matrix.  In 
excursions  where  the  M60  is  used,  a  0  would  appear. 

The  second  case  involves  a  new  system  that  does  not  replace  an  existing 
system.  JSTARS  would  be  such  a  system.  In  this  case,  a  1  would  indicate 
the  presence  of  the  new  system,  while  a  0  would  denote  its  absence. 

Once the design matrix is formed, each excursion is performed as
specified using the combat model, and the outputs are obtained, forming
the Y vector. The coefficients (elements of B) for the linear model are
obtained by matrix algebra. Then we have:

$XB = Y$.  (2)

To solve for B, we have two options. If a full Plackett-Burman matrix is
used without deleting any columns such that X is of full rank, then:

$B = X^{-1} Y$.  (3)

If there is a need or desire to use a reduced matrix (modified Plackett-
Burman design), then we use the standard formula for GLM:

$B = (X^T X)^{-1} X^T Y$  (4)

(Neter, Wasserman, and Kutner, 1985; 239).

These matrix manipulations can be done on spreadsheets, IMSL
routines, FORTRAN programs, SPSS, BMDP, or other statistical programs.

While the general linear model is usually used for fitting a regression line
along a range of variables, it can also be used with qualitative variables (Neter,
Wasserman, and Kutner, 1985; 342). The estimator $b_0$ is the measure of
worth of the base case weapons. The estimators $b_1$ through $b_r$ represent the
contribution of individual systems to the improvement of the Corps-level
MOEs: LER, FER, EBR, etc. "These coefficients are often called
differential intercept coefficients because they reveal by how much the value
of the intercept term of the category that receives the value 1 differs from the
intercept coefficient of the category that receives the value zero. The
category that receives the value zero is often referred to as the base or
comparison category" (Dillon and Goldstein, 1984; 245).

An advantage of the above formulation is that the general linear model
can be used to evaluate different force packages or mixes of systems without
rerunning the combat model; the fitted coefficients give estimates for other
combinations of systems without conducting additional simulation runs.

The number of computer runs needed to determine the linear model
depends on the type of combat model that is being used. A stochastic
model [illegible] the answers for each random number seed [illegible]
constant; therefore, multiple runs must be made for each combination.
Deterministic (expected value) models are not [illegible].

If r is the total number of system contributions to be estimated, at least
r+1 runs are needed using the Plackett-Burman design or a subset of
Plackett-Burman (Plackett and Burman, 1946; 319). The additional run is
required to determine the intercept, $b_0$, for the base case. With r+1 runs,
coefficients can be computed for [illegible]. Plackett-Burman is based on
multiples of four runs. However, one can use subsets of larger designs to
tailor the design matrix. In these cases, one must use equation (4) to find
the coefficients.

[illegible] perfect fit (no residuals) of the output data [illegible]
linear model's aptness can be determined. This is necessary if statistical
tests are needed to determine the significance of the coefficients.
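The run counts discussed in this section can be compared with a small sketch. It assumes that a Plackett-Burman design is available for the smallest multiple of four that is at least r+1, which holds for most practical sizes but should be checked against published designs. The function names are hypothetical.

```python
def full_factorial_runs(r):
    """Number of runs needed to try every on/off combination of r systems."""
    return 2 ** r

def plackett_burman_runs(r):
    """Smallest multiple of four that is at least r + 1 (assumed to exist as a design)."""
    return 4 * ((r + 1 + 3) // 4)

for r in (7, 11, 40):
    print(f"r = {r:2d}: full factorial {full_factorial_runs(r):,} runs, "
          f"Plackett-Burman {plackett_burman_runs(r)} runs")
```

For the seven-system example used later in the paper this gives 8 runs, against 128 for the full factorial. For 40 systems it gives 44 runs against roughly 1.1 trillion.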

An additional advantage of this method is the ability to add a
small number of additional weapon systems later. The past work is not for
naught: additional computer runs can be added, and the problem can grow
from the initial set of runs (see Example 2) into a more traditional multiple
regression problem.

4. ILLUSTRATIVE EXAMPLES. Example 1 (see the appendix for
additional computational details) demonstrates the proposed methodology.
In this example there are 7 systems (r=7) that we want to compare to the
base case, of which four are new direct fire weapons (weapons 1-4), two are
field artillery weapons (FA weapons 5 and 6), and one is a new attack helicopter
(helicopter 7). Using the Plackett-Burman design, we set up the design
matrix, X. The 8 (r+1) computer runs are performed and, in this case, the
output of the Blue Effective Battalions Remaining (EBR) MOE is used, forming
our output vector, Y. Using equation (3) our linear model becomes:

$Y_2 = 3 + 18.5x_1 + 18.5x_2 - 3x_3 + 8x_4 - 14x_5 + 18.5x_6 + 30x_7$  (5)

where $b_0 = 3$, the value of the base case systems, and

$x_i = 1$ if weapon i is present, 0 otherwise.

With the above general linear model, one can see that the largest
improvement is obtained when the Attack Helicopter is introduced,
followed by a three way tie for the next largest contribution among systems
1, 2, and 6. The next most important weapon is 4. Finally, note that systems 3
and 5 actually decrease the ability to obtain a higher EBR score.
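The use of the fitted model to score a force mix without rerunning the combat model can be illustrated as follows. The coefficients are those of equation (5). The function name and the example mixes are hypothetical, and the model contains main effects only, so interactions between systems are ignored.

```python
# Coefficients of equation (5): intercept plus one additive effect per system.
B0 = 3.0
EFFECTS = {1: 18.5, 2: 18.5, 3: -3.0, 4: 8.0, 5: -14.0, 6: 18.5, 7: 30.0}

def predict_ebr(systems_present):
    """Estimated Blue EBR for a force mix given as an iterable of system numbers."""
    return B0 + sum(EFFECTS[i] for i in set(systems_present))

print(predict_ebr([]))             # base case only
print(predict_ebr([1, 7]))         # weapon 1 plus the attack helicopter
print(predict_ebr([1, 2, 3, 5]))   # the mix used in run 1 of the design
print(predict_ebr(range(1, 8)))    # all seven systems
```

The third mix corresponds to run 1 of the design matrix in the appendix, and the estimate matches the output of 23 recorded there.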

Once  the  study  is  done,  the  sponsors  wish  to  add  another  system.  As 
shown  by  Example  2,  this  can  be  done.  The  disadvantage  of  adding  a  new 
system  is  that  there  is  a  loss  of  the  structure  of  the  original  orthogonal 
design.  This  orthogonal  design  gave  us  advantages  of  balance  and  coverage 
of  the  sample  space.  The  same  basic  methodology  can  still  be  used  since  the 
problem  is  still  a  multiple  regression  problem. 


5.  PROBLEMS  USING  THIS  TECHNIQUE.  This  procedure  is  not  a  cure 
for  input  errors  and  intense  weapon  system  interactions,  nor  does  it 
immediately  help  to  explain  counter-intuitive  results.  Thorough  analysis  is 
still  required  to  check  the  answers,  and  then  serious  thought  is  needed  to 
interpret  the  results. 

When this methodology was first used in the VAA study, the discovery
of input errors was more difficult. When the traditional method of combat
modeling is used, only one system is adjusted at a time from the base case.
The [illegible] can easily compare the results from their excursions to
[illegible] the differences and find any errors. It is much [illegible].

[illegible] when some system coefficients were negative.
[illegible] the replacement of an old system with a [illegible]
systems. Examples are search patterns [illegible], firing rates set to zero,
or an incorrect [illegible] the weapon. The methodology has not yet in its use
in VAA [illegible] was worse for a clearly superior system when [illegible]
data for the respective new and old systems were correct.

[illegible] other summary or intermediate statistics
were used to find problems. An example is the number of losses for each
[illegible]. For instance, if the Fractional Exchange Ratio for a system is
[illegible] components of FER, the number of Red losses and the number of
Blue [illegible] particular situation, a database system is invaluable to
manipulate the large amount of data needed to perform these analyses.
Explaining and verifying counter-intuitive results is a difficult and tedious
process.


6.  SUMMARY.  The  Plackett-Burman  designs  are  ideally  suited  for  use 
in  the  VAA  methodology.  The  requirement  to  evaluate  multiple  systems  in 
various combinations would be impossible in a timely manner if combat
simulations were needed for every conceivable grouping of weapon
systems. Thus, the use of RSM to form general linear models will allow
VAA  to  be  a  responsive  tool  needed  in  the  PPBES  process.  While  there  are 
many  benefits  to  this  statistical  approach,  this  method  is  not  a  panacea  for 
combat  modeling.  There  is  still  a  great  requirement  for  thorough  and 
detailed  analysis  to  be  done. 
APPENDIX


Example 1


This example is done using a software package called MathCAD,
which allows one to use matrix algebra. This example is for
demonstration purposes only and does not represent actual data.


ORIGIN = 1        PROPOSAL FOR DETERMINING COEFFICIENTS


X := DESIGN MATRIX, based on the Plackett-Burman design
Y := representative output from the CORBAN model, i.e. Effective
     Battalions Remaining

Run #    X                      Y
  1      1 1 1 1 0 1 0 0       23
  2      1 0 1 1 1 0 1 0       45
  3      1 0 0 1 1 1 0 1       24
  4      1 1 0 0 1 1 1 0       34
  5      1 0 1 0 0 1 1 1       56
  6      1 1 0 1 0 0 1 1       67
  7      1 1 1 0 1 0 0 1       78
  8      1 0 0 0 0 0 0 0        3


$B := (X^T X)^{-1} X^T Y$

B
   3       MEAN Value of the Base Case
  18.5     WPN 1
  18.5     WPN 2
  -3       WPN 3
   8       WPN 4
 -14       FA 1
  18.5     FA 2
  30       ATK HELO


Check of the fit of the model

$Y1 := XB$

$\sum (Y - Y1) = 0$
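Example 1 can be reproduced outside MathCAD with a short, dependency-free sketch. Because the design matrix is square and of full rank, the system XB = Y is solved directly, as in equation (3). The Gauss-Jordan helper below is an illustrative stand-in for a matrix package, not the worksheet's own routine.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

# Design matrix (intercept column followed by systems 1-7) and outputs, Example 1.
X = [
    [1, 1, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0],
]
Y = [23, 45, 24, 34, 56, 67, 78, 3]

B = solve_linear(X, Y)
fitted = [sum(x * b for x, b in zip(row, B)) for row in X]
residual_sum = sum(y - f for y, f in zip(Y, fitted))

print([round(b, 3) for b in B])
print(round(residual_sum, 9))
```

With as many runs as coefficients the fit is exact, so the residual sum is zero up to floating-point error.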
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Example  2 


In  this  example,  we  have  completed  our  computer  runs,  and 
now  have  a  new  weapon  system  to  add  in.  While  we  will  lose 
some  of  the  benefit  of  the  original  orthogonal  design, 
we  can  still  treat  this  as  a  multiple  regression  problem. 


In  this  example,  we  add  an  additional  column,  and  repeated 
the  first  three  rows  of  the  design  and  with  the  presence  of 
the  new  weapon  system. 


ORIGIN  m  1 


DESIGN  MATRIX 


11110  10 
10  1110  1 
10  0  1110 
110  0  111 
10  10  0  11 
110  10  0  1 
1110  10  0 
1  0  0  0  0  0  0 
11110  10 
10  1110  1 
10  0  1110 


0 

0 

1 

0 

1 

1 

1 

0 

0 

0 

1 


#  of 


Runs 

BASED  ON  THE 
ORIGINAL 
PLACKETT-BURMAN 
DESIGN  WITH  AN 
ADDITIONAL  COLUMN 
AND  THREE  MORE  ROWS 


X  :=  DESIGN  MATRIX 


Y 


B 


'23' 

1 

45 

2 

24 

3 

34 

4 

56 

5 

67 

6 

78 

7 

3 

8 

33 

9 

55 

10 

33 

11 

-1 

The  output  matrix  with  the  three  additional 
results  from  the  three  additional  computer  runs. 


(XTX)  XT-Y 


B 


3 

18.583 

18.667 

-3 

7.917 

-14.083 

18.583 

29.833 

9.667 


MEAN  of  the  Base  Case  Weapons 
WPN  1 
WPN  2 

WPN  3  This  is  our  new  coefficient  matrix. 

WPN  4  Notice  the  small  changes  in  the 

FA  1  coefficients  when  compared  to 

FA  2  Example  1. 

ATK  HELO 
NEW  WPN 
We will now check the results of the model. Realize now that we have more points than coefficients, so there may be some variance in the model.

Run #     X·B
  1      23.167
  2      45.167
  3      23.667
  4      34
  5      56
  6      67
  7      78
  8       3
  9      32.833
 10      54.833
 11      33.333

E := Y - (XB)

The variance is represented by the error of the predicted results, Y1, relative to the actual results, Y.

Check of the fit of the model

j := 1 .. rows(X)

Y1 := XB

Run #     Y - Y1
  1      -0.167
  2      -0.167
  3       0.333
  4      -3.553 x 10^-14
  5      -8.527 x 10^-14
  6      -5.684 x 10^-14
  7      -5.684 x 10^-14
  8       6.306 x 10^-14
  9       0.167
 10       0.167
 11      -0.333

$$\sum (Y - Y1) = -5.871 \times 10^{-13}$$

[Graph: E_j plotted against j for j from 0 to 12, with the vertical axis running from -0.4 to 0.4.] This graph shows the distribution of the error, Y - Y1.

In this example the error term is not exactly zero due to round-off error.
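
The two regressions above can be reproduced with a short script. The following is a minimal sketch in Python with NumPy, not the worksheet used by the authors. It assumes that the seven factor columns follow the cyclic Plackett-Burman pattern shown in the design matrix, and that the new-weapon column is 0 for the original eight runs and 1 for the three repeated runs; the names `base`, `fit`, `b1`, and `b2` are illustrative only.

```python
import numpy as np

# Factor columns (WPN 1-4, FA 1, FA 2, ATK HELO) for the eight original runs.
base = np.array([
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
])

# Model output for runs 1-11 (effective battalions remaining).
y = np.array([23, 45, 24, 34, 56, 67, 78, 3, 33, 55, 33], dtype=float)


def fit(X, y):
    """Least squares solution of X b = y, i.e. b = (X'X)^{-1} X'y."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b


# Example 1: eight runs, intercept plus seven factors (square system).
X1 = np.column_stack([np.ones(8), base])
b1 = fit(X1, y[:8])

# Example 2: repeat runs 1-3 with the new weapon present (assumed 0/1 column).
factors = np.vstack([base, base[:3]])
new_wpn = np.r_[np.zeros(8), np.ones(3)]
X2 = np.column_stack([np.ones(11), factors, new_wpn])
b2 = fit(X2, y)

resid = y - X2 @ b2  # E := Y - XB
print(np.round(b1, 3))
print(np.round(b2, 3))
print(np.round(resid, 3), resid.sum())
```

Because Example 2 has eleven runs and nine coefficients, the residuals are no longer all zero, although they still sum to zero up to round-off because the model contains an intercept.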
LECTURE 1


Descriptive Statistics and Tests For White Noise


[...] The purpose of this lecture is to introduce some of the basic descriptive techniques used in time series analysis.

1.1. Time Series Data Types

Time series can be classified in many ways, including by the following four descriptors:

1. [...] while wheat yields recorded over a rectangular grid [...] would have a two-dimensional index set. This second example also illustrates that the index set need not literally be “time” but can also be “position.”

2. Whether the index set is continuous or discrete. [...] continuous one-dimensional series, while daily birth [...]

3. [...] a vector of values measured at each time. For example, [...] interest rates and gross national product is a one-dimensional bivariate time series.

4. [...] continuous; [...] is a binary time series [...] irregularly spaced in time.

We will be almost exclusively concerned with series that are one-dimensional, equally spaced, discrete in time but continuous in space, and univariate.

1.2. Time Series Memory Types

[...] repeated measurements on the same phenomenon. Because of this, one must take into account the correlation between successive observations. This is in contrast to the data analyzed in many areas of statistics, where one assumes that the data are independent, identically distributed observations obtained by randomly sampling some population. The presence of correlation makes the analysis of time series data and the interpretation of the results much more difficult than in the independent case.
Course Materials

This  handout  is  based  on  the  book  TIMESLAB:  A  Time  Series  Analysis  Laboratory 
written  by  H.  Joseph  Newton  and  published  by  Wadsworth  and  Brooks/Cole  of  Pacific 
Grove,  California  93950;  (800)  354-9706.  For  more  information,  contact  the  publisher 
or  H.  Joseph  Newton;  Department  of  Statistics;  Texas  A&M  University;  College  Station, 
Texas  77843;  (409)  845-3141;  email:  jnewton@stat.tamu.edu. 

Outline  of  Lectures 

This  tutorial  is  divided  into  six  lectures.  The  aim  of  the  lectures  is  to  introduce  the 
basic  ideas  and  methods  for  analyzing  time  series  from  a  statistical  point  of  view.  They  will 
make  extensive  use  of  the  computer  program — also  called  TIMESLAB — that  accompanies 
the  speaker’s  book.  The  lectures  are: 

1.  Descriptive  statistics  and  tests  for  white  noise.  Numerical  and  graphical  summaries 
of  time  series  are  discussed  and  illustrated  on  ten  typical  time  series.  Two  tests  are 
described  for  determining  if  a  time  series  can  be  regarded  as  having  no  patterns  (white 
noise). 

2.  Transforming  and  forecasting  time  series.  As  in  regression  analysis,  it  is  sometimes 
necessary  to  transform  a  time  series  before  proceeding  with  the  analysis.  Several 
transformations  are  described.  Also,  several  simple  forecasting  methods  are  described. 
These  methods  do  not  use  time  series  models. 

3.  Time  series  models.  The  basic  properties  of  standard  time  series  models  (ARMA, 
ARIMA,  etc.)  are  described.  The  basic  theory  of  covariance  stationary  time  series  and 
their  prediction  are  discussed. 

4.  Estimation  and  model  identification.  Estimation  procedures  and  their  properties  are 
described for the mean and autocorrelations of a time series as well as for the
parameters of the models introduced in Lecture 3.

5.  Model-based  forecasting  methods.  The  models  and  estimates  discussed  in  Lecture  4  are 
used  to  find  forecasts  and  forecast  intervals.  Forecasting  using  regression  models  is 
also  discussed. 

6.  Searching  for  periodicities.  The  problem  of  determining  if  cycles  exist  in  time  series 
is  considered  and  illustrated  on  some  famous  time  series,  including  sunspot  data.  The 
general  problem  of  spectral  density  estimation  is  also  briefly  discussed. 
[Figure: time plots of the example series; the recoverable panel titles are "Daily California Births" and "Monthly Airline Passengers".]



In  this  section  we  classify  time  series  into  three  broad  classes  according  to  what  we  will  call  their  memory 
type: 

1.  Purely  Random  Series.  This  type  of  series  shows  no  patterns  over  time.  Series  III  is  an  example 
of  such  a  series.  It  was  generated  by  TIMESLAB  using  a  random  number  generator  so  that  it  would  be 
indistinguishable  from  a  random  sample  from  a  standard  normal  distribution.  Series  IV,  on  the  other  hand,  is 
a  real  data  set  (monthly  total  rainfall),  but  appears  quite  similar  to  Series  III.  We  will  see  later  that  such  data 
are  aptly  named  white  noise.  Purely  random  series  are  also  called  no-memory  series,  as  one  characterization 
of  statistical  independence  is  that  an  observation  at  one  time  has  no  memory  of  the  observations  at  any 
other  time. 

2.  Long-Memory  Series.  This  type  of  series  is  the  opposite  extreme  of  white  noise;  that  is,  a  plot 
of  the  data  looks  to  be  almost  that  of  a  deterministic  function  of  time.  Series  V  and  VI  illustrate  this  type. 
The  first  was  artificially  generated  as  values  lying  on  a  cosine  curve  that  goes  through  ten  cycles  with  small 
random  numbers  added  to  each  point.  Series  VI  is  a  real  economic  time  series  (monthly  total  international 
airline  passengers  for  12  years).  These  two  series  have  in  common  that  both  could  be  almost  perfectly 
extrapolated  far  into  the  future  unless  something  were  to  happen  to  the  mechanism  generating  the  data. 
This is the origin of the term “long memory”: the dependence on the past does not die away quickly. Note
that  many  of  the  time  series  in  business  and  economics  are  long  memory. 

3.  Short-Memory  Series.  This  type  lies  between  white  noise  and  long  memory,  occurs  often  in 
the  physical  and  engineering  sciences,  and  comprises  the  bulk  of  the  time  series  that  can  be  most  usefully 
analyzed  by  the  sophisticated  methods  of  time  series  analysis  that  we  will  study.  Series  VIII  and  IX  appear 
to  be  short-memory  series;  clearly  observations  close  together  in  time  are  more  similar  than  those  far  apart 
in  time,  but  there  are  no  apparent  deterministic  patterns  in  the  data  (although  upon  closer  inspection  you 
might  be  able  to  tell  that  Series  IX  is  actually  the  sum  of  four  cosine  curves).  In  a  short-memory  series  the 
predictability  of  the  observations  at  one  place  in  time  from  past  observations  appears  to  die  out  quickly  as 
time  goes  on. 

1.3. Basic Descriptive Statistics

The  first  aim  of  any  statistical  procedure  is  to  give  a  succinct  description  of  the  data  being  analyzed, 
both  graphically  and  numerically.  In  time  series  analysis  there  are  three  basic  graphical  techniques  for 
describing  data:  the  correlogram,  the  partial  correlogram,  and  the  periodogram.  In  this  section  we  introduce 
each of these quantities in turn and illustrate how they are used to describe data $x(1), \ldots, x(n)$.

1.3.1.  The  Sample  Correlogram 

The distinguishing characteristic of a time series is that it can exhibit serial correlation, that is, correlation over time. For example, Figure 1.2 contains scatterplots of $x(t)$ versus $x(t-1)$ for $t = 2, \ldots, n$ for Series I and II. Note that for Series I there appears to be little correlation in the plot, while in Series II there appears to be high positive correlation.

In a time series $x(1), \ldots, x(n)$, we usually want to measure the correlation of the data with themselves except “lagged” a certain number of time units. Thus for a lag $v$, we have $n - v$ pairs of x's that are separated by $v$ time units, namely the pairs

$$(x(1), x(1+v)),\ (x(2), x(2+v)),\ \ldots,\ (x(n-v), x(n)).$$


The  traditional  way  to  measure  serial  correlation  in  time  series  is  by  the  sample  autocorrelation  coefficient 
as  given  in  the  following  definition. 
Figure 1.2. Scatterplot of $x(t)$ versus $x(t-1)$ for Series I and II.

Definition. Let $x(1), \ldots, x(n)$ be a time series. The sample autocorrelation coefficient of lag $v$ is

$$\hat\rho(v) = \frac{\sum_{t=1}^{n-|v|} (x(t) - \bar{x})(x(t+|v|) - \bar{x})}{\sum_{t=1}^{n} (x(t) - \bar{x})^2}, \qquad |v| < n,$$

where $\bar{x}$ is the sample mean of $x(1), \ldots, x(n)$. A plot of $\hat\rho(v)$ versus $v$ for $v = 0, 1, \ldots, M$ for some maximum lag $M$ is called the correlogram of the data.
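
The definition translates directly into code. The following is a minimal sketch in Python with NumPy (not the TIMESLAB implementation); the function name `sample_acf` and the test series, a pure cosine of period 10, are illustrative assumptions.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(0), ..., rho_hat(max_lag)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()                 # deviations from the sample mean
    denom = np.sum(d * d)            # sum of squared deviations
    # For lag v, pair d(t) with d(t + v) for t = 1, ..., n - v.
    return np.array([np.sum(d[: n - v] * d[v:]) / denom
                     for v in range(max_lag + 1)])


# Hypothetical example: 100 points of a cosine with period 10.
t = np.arange(1, 101)
x = np.cos(2 * np.pi * (t - 1) / 10)
rho = sample_acf(x, 20)
print(np.round(rho[[0, 5, 10]], 3))
```

For this series the correlogram is itself close to sinusoidal with period 10, shrinking slowly because only $n - v$ pairs enter the numerator while all $n$ points enter the denominator.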


To illustrate the use of the correlogram, consider Figure 1.3, where we have given the correlograms for the ten series introduced earlier. A large value of $\hat\rho(v)$ for [...] a possible periodicity in the data of [...] time units. For example, [...] curve is also sinusoidal. Notice that Series V and VI exhibit this behavior.

[...] increases means that the series is short (long) memory. Note that the wheat [...] and the random walk series appear to be long memory by inspecting the [...]

1.3.2. The Sample Partial Correlogram

[...] if we have [...], we can find the residuals $e_Y$ and $e_Z$ of regressing $Y$ on the X's and $Z$ on the X's, and then find the usual sample correlation coefficient between $e_Y(1), \ldots, e_Y(n)$ and $e_Z(1), \ldots, e_Z(n)$. This is what is called the sample partial correlation coefficient between $Y$ and $Z$ given $X_1, \ldots, X_r$, denoted $\hat\rho_{YZ|X_1 \cdots X_r}$. The sample multiple correlation coefficient [...] is the proportion of variability in $Y$ that is “explained” by its linear relationship with the [...]

[...] analysis are the sample partial autocorrelation coefficients and residual variances. The sample partial autocorrelation coefficient of lag $v$ is the correlation between $x(t)$ and $x(t+v)$ after having removed the common linear effect of the data in between (the lag one partial [...] autocorrelation). We denote the lag $v$ partial autocorrelation by $\hat\theta_v$. We will usually display standardized residual variances, that is, the residual variances divided by the sample variance. This [...] is between zero and one, and a small (large) value indicates that $x(t)$ is (is not) very predictable from its past, thus indicating that the data are long (short) memory. A useful rule of thumb is that data are [...] their standardized residual variance sequence becomes less than $8/n$ [...]

[...] partial autocorrelations in the same way that usual partial correlations are used. In Series VII, for example, the correlogram takes some time to decay [...] and then small thereafter. This indicates that the correlation between $x(t)$ and $x(t+2)$ is due only to the common relationship of $x(t)$ and $x(t+2)$ to $x(t+1)$.

Figure 1.3. The Correlograms for Series I-X.
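
One standard way to obtain partial autocorrelations and standardized residual variances from a set of autocorrelations is the Durbin-Levinson recursion. The sketch below, in Python with NumPy, is offered as an illustration under that assumption; the handout does not say which algorithm TIMESLAB uses, and the function name `partial_autocorrelations` is hypothetical.

```python
import numpy as np


def partial_autocorrelations(rho, max_lag):
    """Durbin-Levinson recursion.

    rho[0], ..., rho[max_lag] are autocorrelations with rho[0] == 1.
    Returns (pacf, resvar): the partial autocorrelations of lags
    1..max_lag and the standardized residual variances of the best
    linear predictors of orders 1..max_lag.
    """
    rho = np.asarray(rho, dtype=float)
    pacf = np.zeros(max_lag)
    resvar = np.zeros(max_lag)
    phi = np.zeros(0)   # predictor coefficients of the current order
    var = 1.0           # standardized residual variance of order 0
    for k in range(1, max_lag + 1):
        # Correlation left at lag k after using lags 1..k-1.
        num = rho[k] - np.dot(phi, rho[k - 1:0:-1])
        theta = num / var
        phi = np.concatenate([phi - theta * phi[::-1], [theta]])
        var *= 1.0 - theta ** 2
        pacf[k - 1] = theta
        resvar[k - 1] = var
    return pacf, resvar


# Hypothetical check: autocorrelations that decay geometrically, 0.6**v.
rho = 0.6 ** np.arange(6)
pacf, resvar = partial_autocorrelations(rho, 5)
print(np.round(pacf, 6), np.round(resvar, 6))
```

For geometrically decaying autocorrelations the lag one partial equals the lag one autocorrelation and all higher partials vanish, which is the situation described above where the correlation between $x(t)$ and $x(t+2)$ is due only to their common relationship to $x(t+1)$.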


1.3.3. The Periodogram and Sample Spectral Density

[...] into these simple frequency components and studying them separately. Further, [...] independent of each other [...] the form of a decomposition into independent sinusoids that can be studied separately.

Definition. A sinusoid $g_k(x)$ at period $k$ (or frequency $1/k$) is the function

$$g_k(x) = a\cos\frac{2\pi x}{k} + b\sin\frac{2\pi x}{k}, \qquad x \in (-\infty, \infty).$$

Since cos and sin have period $2\pi$, we have that $g_k(x + kl) = g_k(x)$ for any integer $l$. We can also write

$$g_k(x) = C\cos\left(\frac{2\pi x}{k} - \phi\right),$$

where $C = \sqrt{a^2 + b^2}$ and $\phi = \arctan(b/a)$ are called the amplitude and phase of $g_k$.

[...] the sum of the [...] sinusoids.

[...] we define the discrete Fourier transform (DFT) of a set of numbers $x(1), \ldots, x(n)$ [...]

$$z(k) = \sum_{t=1}^{n} x(t)e^{2\pi i(t-1)\omega_k} = \sum_{t=1}^{n} x(t)\cos 2\pi(t-1)\omega_k + i\sum_{t=1}^{n} x(t)\sin 2\pi(t-1)\omega_k,$$

where $\omega_k = (k-1)/n$, $k = 1, \ldots, n$.

Figure 1.4. The Partial Autocorrelations (Solid Curve) and Standardized Residual Variances (Dotted Curve) for Series I-X.

Table 1.1. Interpreting the Periodogram

Appearance of Data                  Nature of Periodogram
----------------------------------  ----------------------------------------------
Smooth                              Excess of low frequency; that is, amplitudes
                                    of sinusoids of low frequency (long period)
                                    are large relative to other frequencies
Wiggly                              Excess of high frequency
Random (no pattern)                 No frequencies dominate
Basically sinusoidal of period p    A peak at frequency 1/p
time units
Periodic of period p but not        A peak at fundamental frequency 1/p and peaks
sinusoidal                          at some multiples of 1/p (harmonics)


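Theorem 1.3.1 Sinusoidal Decomposition of Data

Let $z(1), \ldots, z(n)$ be the DFT of $x$. For $k = 1, \ldots, n$, let

$$a_k = \frac{1}{n}\,\mathrm{Re}(z(k)), \qquad b_k = \frac{1}{n}\,\mathrm{Im}(z(k)).$$

Let $S^2 = \frac{1}{n}\sum_{t=1}^{n} (x(t) - \bar{x})^2$, and for $t = 1, \ldots, n$ define

$$g_k(t) = a_k\cos 2\pi(t-1)\omega_k + b_k\sin 2\pi(t-1)\omega_k = C_k\cos[2\pi(t-1)\omega_k - \arctan(b_k/a_k)],$$

where $C_k^2 = a_k^2 + b_k^2$. Then

[...]

$$b_1 = 0, \qquad C_1^2 = \bar{x}^2.$$

d) For $k = 2, \ldots, [n/2] + 1$, we have

$$a_k = a_{n-k+2}, \qquad b_k = -b_{n-k+2},$$
$$\cos 2\pi(t-1)\omega_k = \cos 2\pi(t-1)\omega_{n-k+2},$$
$$\sin 2\pi(t-1)\omega_k = -\sin 2\pi(t-1)\omega_{n-k+2},$$

and thus

$$g_k(t) = g_{n-k+2}(t), \qquad C_k^2 = C_{n-k+2}^2,$$

which gives

$$x(t) - \bar{x} = \begin{cases} 2\sum_{k=2}^{[n/2]+1} g_k(t), & n \text{ odd} \\ 2\sum_{k=2}^{[n/2]} g_k(t) + g_{(n/2)+1}(t), & n \text{ even} \end{cases}$$

$$S^2 = \begin{cases} 2\sum_{k=2}^{[n/2]+1} C_k^2, & n \text{ odd} \\ 2\sum_{k=2}^{[n/2]} C_k^2 + C_{(n/2)+1}^2, & n \text{ even.} \end{cases}$$

e) For $k = 1, \ldots, [n/2] + 1$, the vectors $g_k = (g_k(1), \ldots, g_k(n))^T$ are orthogonal, that is, $g_j^T g_k = 0$, $j \ne k$.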


[...] (as measured by $S^2$) in terms of the sum of squared amplitudes of the sinusoids [...]. In terms of understanding why the x's vary then, these squared amplitudes play an important role. Note that

$$C_k^2 = |z(k)|^2 / n^2.$$

Definition. For a time series data set $x(1), \ldots, x(n)$, let $\omega_k = (k-1)/n$, $k = 1, \ldots, [n/2] + 1$, and define

$$C_k^2 = \frac{1}{n^2}\left|\sum_{t=1}^{n} x(t)e^{2\pi i(t-1)\omega_k}\right|^2, \qquad k = 1, \ldots, [n/2] + 1.$$

A plot of $nC_k^2$ versus $\omega_k$ is called the periodogram of $x$. The function

$$\hat{f}(\omega) = \begin{cases} \frac{1}{n}\left|\sum_{t=1}^{n} x(t)e^{2\pi i(t-1)\omega}\right|^2, & \omega \in [0, .5] \\ \hat{f}(1 - \omega), & \omega \in [.5, 1] \end{cases}$$

is called the sample spectral density function of $x$.

Note that the periodogram is the sample spectral density evaluated at the so-called natural frequencies $\omega_1, \omega_2, \ldots, \omega_{[n/2]+1}$.

Interpreting the Periodogram

In Figure 1.5 we give a plot of three data sets of length 200 (and the log of their periodograms) that were constructed by

$$x(t) = \alpha\cos\frac{2\pi(t-1)}{100} + \beta\cos\frac{2\pi(t-1)}{10} + \gamma\cos\frac{2\pi(t-1)}{4},$$

that is, $x$ is the sum of pure cosines of frequencies 1/100, 1/10, and 1/4. The three series were obtained by varying $\alpha$, $\beta$, and $\gamma$ (Series 1 has (10,3,1), Series 2 has (3,3,3), and Series 3 has (1,3,10)).

Recall that a sinusoid of frequency $\omega$ is the sum of a sine and a cosine of that frequency, but for simplicity we are considering sinusoids that have no sine part. A sinusoid of long period (low frequency) is very smooth in appearance relative to one of short period (high frequency). Thus when $\alpha$ is large relative to $\beta$ and $\gamma$ (such as Series 1), we would expect $x$ to be relatively smooth in appearance; that is, the long-term rise and fall of the data should be large relative to short-term oscillations. On the other hand, Series 3 appears quite wiggly since $\gamma$ is large relative to $\alpha$ and $\beta$. In Series 2 we have used equal values of the three coefficients and the resulting data are between Series 1 and 3 in terms of wiggliness.
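
The three constructed series make a convenient check on the periodogram definition. The following minimal sketch, in Python with NumPy, computes $nC_k^2 = |z(k)|^2/n$ at the natural frequencies; the function names `periodogram` and `cosine_sum` are illustrative, and `numpy.fft.fft` uses the opposite sign in the exponent, which does not change $|z(k)|$ for real data.

```python
import numpy as np


def periodogram(x):
    """Return natural frequencies (k-1)/n and ordinates n*C_k^2."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    q = n // 2 + 1                       # q = [n/2] + 1
    z = np.fft.fft(x)                    # DFT; sign convention does not affect |z|
    freqs = np.arange(q) / n             # omega_k = (k - 1) / n
    return freqs, np.abs(z[:q]) ** 2 / n


def cosine_sum(alpha, beta, gamma, n=200):
    """Sum of pure cosines of frequencies 1/100, 1/10 and 1/4."""
    t = np.arange(1, n + 1)
    return (alpha * np.cos(2 * np.pi * (t - 1) / 100)
            + beta * np.cos(2 * np.pi * (t - 1) / 10)
            + gamma * np.cos(2 * np.pi * (t - 1) / 4))


freqs, p1 = periodogram(cosine_sum(10, 3, 1))   # Series 1: smooth
_, p3 = periodogram(cosine_sum(1, 3, 10))       # Series 3: wiggly

# Indices of the three largest ordinates of Series 1.
top3 = sorted(int(i) for i in np.argsort(p1)[-3:])
print(freqs[top3], freqs[np.argmax(p1)], freqs[np.argmax(p3)])
```

The three largest ordinates fall at the frequencies of the component cosines, and the dominant peak moves from the lowest frequency for the smooth series to the highest for the wiggly one, in line with Table 1.1.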
Figure 1.5. Sums of Pure Cosines and Their Log Periodograms.

[...] the component sinusoids. In general, of course, time series data sets are [...]. However, the above [...] of the periodograms of the ten univariate series that we have been [...] and how the periodogram is able to tell that Series [...]

Displaying the Periodogram

It often happens that a few values [...] are very large relative to the rest and thus dwarf them in the plot. Plotting the (natural) log of [...] periodogram on some standard scale so that several such plots can be compared in a meaningful [...] by the following theorem, which shows [...]
Figure 1.6. The Log of the Standardized Periodograms of Series I-X.
Theorem 1.3.2 Standardizing the Periodogram

Let $nC_k^2$ for $k = 1, \ldots, n$ be the periodogram ordinates of a time series $x(1), \ldots, x(n)$, and let $\hat\sigma^2 = \frac{1}{n}\sum_{t=1}^{n} (x(t) - \bar{x})^2$. Then [...], that is, the average value of $nC_k^2/\hat\sigma^2$ is one.

Because of this theorem we will routinely display

$$\log\left(\frac{nC_k^2}{\hat\sigma^2}\right) \text{ versus } \omega_k = \frac{k-1}{n}, \qquad k = 1, \ldots, [n/2] + 1,$$

[...] -6 and 6 on the vertical axis. We will rarely encounter a value of $nC_k^2/\hat\sigma^2$ that [...]


The Sample Spectral Distribution Function

[...] let $q = [n/2] + 1$ and define

$$\hat{F}(\omega_k) = \frac{\sum_{j=1}^{k} \hat{f}(\omega_j)}{\sum_{j=1}^{q} \hat{f}(\omega_j)}, \qquad k = 1, \ldots, q.$$

Then $\hat{F}(\omega_1), \ldots, \hat{F}(\omega_q)$ is called the sample spectral distribution function of $x$.

[...] Note that

$$0 \le \hat{F}(\omega_k) \le 1, \qquad k = 1, \ldots, q,$$

[...] that the cumulative periodogram starts out above (below) the line $y = 2x$ before [...]. The cumulative periodograms of the white noise series (Series III and IV) vary little from the line $y = 2x$.
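Figure 1.8. Example of White Noise Test.

1.3.4. The Relationship of the Correlogram and Periodogram

Recall that the sample autocorrelation coefficient of lag $v$ for data $x$ is given by

$$\hat\rho(v) = \frac{\sum_{t=1}^{n-|v|} (x(t) - \bar{x})(x(t+|v|) - \bar{x})}{\sum_{t=1}^{n} (x(t) - \bar{x})^2}, \qquad |v| < n.$$

Then $\hat\rho(v)$ can be written as

$$\hat\rho(v) = \frac{\hat{R}(v)}{\hat{R}(0)}, \qquad |v| < n,$$

if we define the sample autocovariance function $\hat{R}$ by

$$\hat{R}(v) = \frac{1}{n}\sum_{t=1}^{n-|v|} (x(t) - \bar{x})(x(t+|v|) - \bar{x}), \qquad |v| < n.$$

We note that it can be shown that

$$\hat{f}(\omega) = \sum_{v=-(n-1)}^{n-1} \hat{R}(v)e^{-2\pi i v\omega}, \qquad \hat{R}(v) = \int_0^1 \hat{f}(\omega)e^{2\pi i v\omega}\,d\omega.$$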

The Fast Fourier Transform (FFT)

Calculating the discrete Fourier transform

$$z(k) = \sum_{t=1}^{n} x(t)e^{2\pi i(t-1)(k-1)/n}, \qquad k = 1, \ldots, n,$$

of data $x(1), \ldots, x(n)$ appears to require $n^2$ multiplications and additions ($n$ for each of the $n$ values of $k$) and many evaluations of complex exponentials. In the mid-1960s various researchers made use of a variety of trigonometric identities to obtain algorithms called fast Fourier transform (FFT) algorithms that essentially require only $n(p_1 + \cdots + p_l)$ multiplications and additions, where $p_1, \ldots, p_l$ are the prime factors of $n$, and a greatly reduced number of evaluations of complex exponentials. For example, if $n = 1024 = 2^{10}$, then the number of operations is $1024(2 + \cdots + 2) = 10 \cdot 2 \cdot 1024 = 2n\log_2 n$, which is approximately 20,000 as opposed to $1024^2 \approx 1{,}000{,}000$, a savings of a factor of about 50.

Note that if $n$ is not very composite, that is, it has some large prime factors, the FFT saves very little over a straightforward DFT.
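
The operation counts quoted above are easy to check, and the direct DFT can be compared with a library FFT. The sketch below, in Python with NumPy, is an illustration only: `naive_dft`, `prime_factors`, and `fft_operations` are hypothetical helper names, the operation count is the rough $n(p_1 + \cdots + p_l)$ figure from the text rather than a measured cost, and `numpy.fft.fft` uses the opposite sign in the exponent, so its output is conjugated for comparison with real data.

```python
import numpy as np


def naive_dft(x):
    """Direct DFT with the e^{+2 pi i (t-1)(k-1)/n} convention (n^2 work)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    idx = np.arange(n)
    W = np.exp(2j * np.pi * np.outer(idx, idx) / n)
    return W @ x


def prime_factors(n):
    """Prime factors of n, with multiplicity."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def fft_operations(n):
    """Rough operation count n * (p_1 + ... + p_l) quoted in the text."""
    return n * sum(prime_factors(n))


rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
z_naive = naive_dft(x)
z_fft = np.conj(np.fft.fft(x))   # conjugate to match the + sign convention

print(fft_operations(1024), 1024 ** 2, 1024 ** 2 / fft_operations(1024))
print(prime_factors(1021), fft_operations(1021), 1021 ** 2)
```

For $n = 1024$ the count is 20,480 against 1,048,576, a ratio of about 51. For the prime length $n = 1021$ the two counts coincide, which illustrates the remark that a length with large prime factors gains little.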

1.4.  Testing  for  White  Noise 

The first inference that one should make about an observed time series is whether or not it could be considered to be a realisation from a white noise process. In this section we consider two graphs that are useful for making such an inference. If $x(1), \ldots, x(n)$ is a random sample from a population, then for large $n$:

a) The correlations $\hat\rho(1), \ldots, \hat\rho(m)$ are independent and identically distributed as $N(0, 1/n)$ variables. Thus there is approximately a 95 percent chance that an individual $\hat\rho(v)$ will be inside of $\pm 1.96/\sqrt{n}$. To produce simultaneous confidence bands having 95 percent confidence level, we must construct individual intervals having level $.95^{1/m}$. This is what is done in the first graph.

b) The cumulative periodogram has a 95 percent chance of falling entirely within the lines $y = 2x \pm 1.36/\sqrt{q}$, where $q = [n/2] + 1$. The second graph is the cumulative periodogram with these two lines and the line $y = 2x$.

In Figure 1.8 we give the result for two data sets; the first being a normal white noise series of length 100, and the second being the same series with a cosine of length 100, amplitude .5, and period 4 added to it. Notice that none of the boundary lines are crossed for either series, but that the cumulative periodogram seems to increase rapidly at $\omega = .25$, something that is unusual and should be a hint that this data set is in fact not white noise.
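
The two sets of bounds can be computed as follows. This is a minimal sketch in Python, not the TIMESLAB procedure: the function names are hypothetical, the normal quantile comes from the standard library's `statistics.NormalDist`, and the data are mean-corrected before the periodogram is formed, which is an assumption made here so that the zero-frequency ordinate does not dominate the cumulative sum.

```python
from statistics import NormalDist

import numpy as np


def correlogram_band(n, m, level=0.95):
    """Half-width of simultaneous bands for rho_hat(1), ..., rho_hat(m)."""
    individual = level ** (1.0 / m)                # level of each interval
    z = NormalDist().inv_cdf(0.5 + individual / 2.0)
    return z / np.sqrt(n)


def cumulative_periodogram(x):
    """Natural frequencies and cumulative periodogram F_hat."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                               # assumption: remove the mean
    n = len(x)
    q = n // 2 + 1
    f = np.abs(np.fft.fft(x)[:q]) ** 2 / n
    return np.arange(q) / n, np.cumsum(f) / np.sum(f)


def max_deviation(x):
    """Largest distance of F_hat from y = 2x, and the bound 1.36/sqrt(q)."""
    w, F = cumulative_periodogram(x)
    return float(np.max(np.abs(F - 2 * w))), 1.36 / np.sqrt(len(w))


# A pure cosine of period 4 (no noise) is clearly rejected.
t = np.arange(100)
dev, bound = max_deviation(0.5 * np.cos(2 * np.pi * t / 4))
print(correlogram_band(100, 1), correlogram_band(100, 20), dev, bound)
```

With $m = 1$ the band reduces to the familiar $\pm 1.96/\sqrt{n}$, and it widens as $m$ grows. For the noise-free cosine the cumulative periodogram jumps from 0 to 1 at $\omega = .25$, far outside the bound; in the noisy example of Figure 1.8 the same jump is present but smaller.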
Transforming and Forecasting Time Series


2.1. Transformations


Some of the time series analysis techniques that we will introduce in later lectures assume (1) that the data being analyzed have no deterministic trends or cycles, and (2) that the variability in the data is constant over time. The traditional method of analyzing data that fail to meet these requirements is to do the analysis in three steps: (1) try various transformations until the result appears to meet the requirements, (2) analyze the result of step 1, and (3) do the inverse operation of what was done in step 1. This strategy is very similar to that used in regression analysis. In this section we will describe some of the transformations that are often used in time series analysis.

2.1.1. Stabilizing Variance

Suppose that the variability in a data set $x$ appears to be increasing as time increases (see the airline data, for example). If the mean level of the data is also increasing with time, then the variability in $y(t) = \log x(t)$ should appear fairly constant over time. This is a fairly common occurrence in real data. In general one might try various power transformations to obtain a series having constant variability; that is,

$$y(t) = (x(t))^{\lambda}.$$


2.1.2. Removing Trends


Another common phenomenon in time series data is that the values appear to be growing in some polynomial fashion with time, particularly linearly; that is, the data appear to follow

$$x(t) = a + bt + \epsilon(t),$$

where $\epsilon(t)$ is white noise. Such polynomial trends are often removed by using differencing.

Definition. The dth difference of a time series $x$ having $n$ elements is a data set $y$ having $n - d$ elements:

$$y(t) = x(t+d) - x(t), \qquad t = 1, \ldots, n - d.$$

It is easy to see that if $x$ contains a general dth-degree polynomial trend, applying first differencing $d$ times will remove it, while if $x$ has a cycle of length $s$ time units, then taking sth differences will remove the cycle.

Figure 2.1. Residuals from Regression for Series V.

Perhaps  the  most  well  known  example  of  differencing  is  the  airline  data  that  we  have  been  analyzing. 
The  variability  in  the  data  is  increasing  with  time  and  thus  a  log  transform  is  usually  applied.  The  result 
of  this  transform  has  an  obvious  12-month  cycle  and  also  a  linear  trend.  Thus  the  traditional  advice  on  a 
series  such  as  this  is  to  take  the  first  difference  of  the  12th  difference  of  the  log  of  the  original  data. 
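
The effect of this recipe can be seen on a synthetic series. The sketch below, in Python with NumPy, uses a made-up log-scale series with a linear trend and an exact 12-month cycle in place of the airline data, which are not reproduced here; the function name `difference` is illustrative.

```python
import numpy as np


def difference(x, d=1):
    """dth difference: y(t) = x(t + d) - x(t), t = 1, ..., n - d."""
    x = np.asarray(x, dtype=float)
    return x[d:] - x[:-d]


# Hypothetical stand-in for the logged data: 12 years of monthly values
# with a linear trend and an exact annual cycle.
t = np.arange(1, 145)
log_x = 5.0 + 0.01 * t + 0.2 * np.cos(2 * np.pi * (t - 1) / 12)

d12 = difference(log_x, 12)    # removes the 12-month cycle
y = difference(d12, 1)         # removes what is left of the linear trend

print(len(y), np.round(d12[:3], 6), np.abs(y).max())
```

The 12th difference of this series is the constant 0.12 (twelve months of trend), and the further first difference is zero up to round-off. With real data what remains after the two differences is the part that still has to be modeled.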

Regression Analysis

A traditional method of removing trends and cycles from time series data is to do ordinary regression of the data on the deterministic functions of time. For example, Series V of the univariate series that we have been analyzing was formed by

$$x(t) = 10 + .1t + 3\cos\frac{2\pi(t-1)}{[\ldots]} + \epsilon(t), \qquad t = 1, \ldots, 100,$$

where $\epsilon(t)$ is a series having serial correlation. If we did ordinary least squares regression of the form

$$x(t) = \beta_0 + \beta_1 t + \beta_2\cos\frac{2\pi(t-1)}{[\ldots]} + \epsilon(t),$$

we would obtain the residuals given in Figure 2.1, which can then be further analyzed by time series analysis.
2.1.3.  Accounting  for  Seasonal  Variability 

Often it is of interest in seasonal time series to analyze how a data set differs from regular seasonal variation. For example, suppose that $x$ consists of $m$ years of monthly data, that is, $n = 12m$. We can define the monthly means and variances to be the means and variances of each of the 12 data sets consisting of like months; the Januaries, Februaries, and so on. Thus,

$$\bar{x}_k = \frac{1}{m}\sum_{t=1}^{m} x(k + 12(t-1)), \qquad k = 1, \ldots, 12.$$

[...] (for example) means and variances [...]

Figure 2.2. An Example of the Moving Average Smoother.

2.1.4. General Smoothing Operations

[...] to use general methods of smoothing data and then analyze deviations from the smooth [...]. [...] the moving average smoother [...]:

$$y(1) = \frac{x(1) + x(2) + x(3)}{3}$$
$$y(2) = \frac{x(2) + x(3) + x(4)}{3}$$
$$\vdots$$
$$y(n-2) = \frac{x(n-2) + x(n-1) + x(n)}{3}$$

Since consecutive y's have two of the three x's in common, we would expect that the y's won't vary as much as the original x's; that is, they will be smoother in appearance. In Figure 2.2 we have generated 100 points of [...], added N(0,1) white noise to it, and then applied a moving average smoother of length 11. Note that [...] to the 95th data point, and that we have superimposed [...]. Note that the smoothed data clearly follow the cosine curve, which is not obvious in the original data.
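
A moving average smoother of any odd length can be written as a convolution with equal weights. The following is a minimal sketch in Python with NumPy. The cosine's amplitude and period and the random seed are assumptions made for illustration, since the details of the series behind Figure 2.2 are not legible in the source.

```python
import numpy as np


def moving_average(x, length=3):
    """Equal-weight moving average; returns n - length + 1 smoothed values."""
    x = np.asarray(x, dtype=float)
    weights = np.ones(length) / length
    return np.convolve(x, weights, mode="valid")


# Hypothetical example in the spirit of Figure 2.2: a cosine plus N(0,1) noise.
rng = np.random.default_rng(1)
t = np.arange(1, 101)
signal = 2.0 * np.cos(2 * np.pi * (t - 1) / 50)
x = signal + rng.standard_normal(100)

y = moving_average(x, 11)      # 90 smoothed values, centered on points 6..95
print(len(y), np.std(np.diff(x)), np.std(np.diff(y)))
```

Each smoothed value shares ten of its eleven inputs with its neighbor, so successive values change far less than the raw data do, which is why the underlying curve becomes visible.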
Figure 2.3. Airline Data and Extended Values.

2.2.3. Simple Moving Average

This method models an observation as a simple average of the previous $m$ observations, where $m$ is to be chosen. For a given moving average length $k$ we can calculate a quantity $S(k)$ as a measure of how well the simple moving average model of length $k$ fits the observed data. We choose $m$ as the value of $k$ minimizing $S(k)$. Then we forecast future values recursively. For example, if $m = 3$ we calculate

$$\hat{x}(n+1) = \tfrac{1}{3}[x(n) + x(n-1) + x(n-2)]$$
$$\hat{x}(n+2) = \tfrac{1}{3}[\hat{x}(n+1) + x(n) + x(n-1)]$$
$$\hat{x}(n+3) = \tfrac{1}{3}[\hat{x}(n+2) + \hat{x}(n+1) + x(n)],$$

and so on.
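
The recursion can be sketched as follows in Python with NumPy. Since the formula for $S(k)$ is not legible in the source, the function `sma_fit_error` below assumes it is the sum of squared one-step-ahead errors; that choice, and both function names, are illustrative only.

```python
import numpy as np


def sma_fit_error(x, k):
    """Assumed form of S(k): sum of squared one-step-ahead errors."""
    x = np.asarray(x, dtype=float)
    preds = np.array([x[t - k:t].mean() for t in range(k, len(x))])
    return float(np.sum((x[k:] - preds) ** 2))


def sma_forecast(x, m, steps):
    """Forecast recursively, feeding each forecast back into the average."""
    history = [float(v) for v in x]
    forecasts = []
    for _ in range(steps):
        nxt = sum(history[-m:]) / m
        history.append(nxt)
        forecasts.append(nxt)
    return forecasts


x = [1, 2, 3, 4, 5, 6]
print(sma_forecast(x, 3, 3))
print([sma_fit_error(x, k) for k in (1, 2, 3)])
```

For the series 1, ..., 6 with $m = 3$ the forecasts are 5, 16/3 and 49/9: the method lags behind a trend, which is one reason it is best suited to data without one. Note that $S(k)$ as assumed here sums a different number of terms for each $k$, so a per-term average may be a fairer basis for comparison.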

2.2.4.  Simple  Exponential  Smoothing 

Instead  of  modeling  an  observation  as  a  simple  average  of  the  previous  m  observations,  exponential 
smoothing  methods  model  x(t)  as  a  weighted  average  of  all  of  the  previous  values.  The  type  of  weights  that 
are  used  depends  on  the  appearance  of  the  data  and  leads  to  methods  having  a  variety  of  names.  We  will 
discuss  only  the  simple  exponential  smoothing  technique  which  is  most  suitable  for  data  that  appear  to  have 
no  linear  or  seasonal  trends. 
2.2. Some Simple Forecasting Methods

[...] Each of the methods can be viewed as having two parts: (1) a model-fitting part, and (2) a forecasting part.

2.2.1. The Inverse Differencing Method

[...] trends or cycles in data [...] if the extended series of length $n + M$ were then differenced, the last $M$ values [...] would be [...] the mean of the differences [...]. For example, if a series appears to have a linear trend, we might detrend it by applying a first difference. We then [...] $\hat{x}(n+1) - x(n) = \bar{z}$, where $\bar{z}$ is the mean of the first differences of $x(1), \ldots, x(n)$. Thus $\hat{x}(n+1) = x(n) + \bar{z}$, and if we continue this process, we obtain

$$\hat{x}(n+h) = \hat{x}(n+h-1) + \bar{z}, \qquad h \ge 1,$$

where $\hat{x}(t) = x(t)$ for $t \le n$. [...] a seasonal cycle, we [...] both first and sth differences, where $s$ is the length of the seasonal cycle. For monthly data having an annual cycle, we would solve

$$\hat{x}(n+1) - x(n) - x(n-11) + x(n-12) = \bar{z},$$

where $\bar{z}$ is the mean of the first differences of the 12th differences of $x(1), \ldots, x(n)$. This leads to the predictor

$$\hat{x}(n+h) = \hat{x}(n+h-1) + \hat{x}(n+h-12) - \hat{x}(n+h-13) + \bar{z}, \qquad h \ge 1.$$

[...] while the extended values are represented only by the solid curve. The extended values [...] two cycles on the graph and are consistent with what we would expect. It is not surprising that we could use this “naive” method on the airline data, as [...] we would expect that almost any method would work well, including just drawing in the next two cycles [...]. [...] do not follow such a deterministic pattern we will have to use more sophisticated methods.

2.2.2.  The  Regression  Method 

In the regression method, we fit a simple or multiple linear regression model to the series as a function of time, and predict future values of the series by substituting future values of t into the least squares regression function. For example, if

$$x(t) = a + bt + c\cos\frac{2\pi(t-1)}{p} + \epsilon(t),$$

for t = 1, ..., n, where the period p is known, we can estimate a, b, and c by their least squares estimates $\hat{a}$, $\hat{b}$, and $\hat{c}$, and forecast a value h steps into the future by

$$\hat{x}(n+h) = \hat{a} + \hat{b}(n+h) + \hat{c}\cos\frac{2\pi(n+h-1)}{p}.$$
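As a rough illustration of the regression method for the trend-plus-cosine model, the following sketch fits a, b, and c by least squares and substitutes future times into the fitted function. The name `regression_forecast` is hypothetical, and the cosine argument $2\pi(t-1)/p$ is taken from the forecast formula above.

```python
import numpy as np

def regression_forecast(x, p, horizon):
    """Fit x(t) = a + b*t + c*cos(2*pi*(t-1)/p) + error; forecast n+1..n+horizon."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def design(t):
        # columns: intercept, linear trend, cosine of known period p
        return np.column_stack([np.ones_like(t), t, np.cos(2 * np.pi * (t - 1) / p)])

    t = np.arange(1, n + 1, dtype=float)
    coef, *_ = np.linalg.lstsq(design(t), x, rcond=None)   # least squares (a, b, c)
    t_new = np.arange(n + 1, n + horizon + 1, dtype=float)
    return design(t_new) @ coef
```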
The simple exponential smoothing method consists of modeling x(t) as a weighted average:

$$\hat{x}(t+1) = \sum_{j=1}^{t} \beta_j\, x(t+1-j), \qquad t \ge 1.$$

If we let $\beta_j = \alpha(1-\alpha)^{j-1}$, where $0 < \alpha < 1$; that is, we let the weights decay exponentially to zero, then $\sum_{j=1}^{\infty} \beta_j = 1$, and thus for large t, the weights will approximately sum to one. We can choose $\alpha$ as the value of $\alpha$ minimizing

$$S(\alpha) = \sum_{t=2}^{n} \left[ x(t) - \sum_{j=1}^{t-1} \alpha(1-\alpha)^{j-1} x(t-j) \right]^2$$

and then forecast the value at time n + 1 by

$$\hat{x}(n+1) = \sum_{j=1}^{n} \alpha(1-\alpha)^{j-1} x(n+1-j).$$

The calculations involved in this process can be greatly reduced by noting that

$$\hat{x}(t+1) = \alpha x(t) + (1-\alpha)\hat{x}(t),$$

where $\hat{x}(1)$ is defined to be x(1). We can also write this as $\hat{x}(t+1) - (1-\alpha)\hat{x}(t) = \alpha x(t)$, which is called a difference equation of order one.
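A minimal sketch of simple exponential smoothing using the order-one recursion follows. The function names and the grid search over $\alpha$ are assumptions made for illustration; the text only says that $\alpha$ is chosen to minimize $S(\alpha)$, not how the minimization is carried out.

```python
import numpy as np

def exp_smooth(x, alpha):
    """Return xhat with xhat[i] = forecast of x(i+1) (1-based time), i = 0..n."""
    x = np.asarray(x, dtype=float)
    xhat = np.empty(len(x) + 1)
    xhat[0] = x[0]                       # xhat(1) is defined to be x(1)
    for t in range(len(x)):
        # xhat(t+1) = alpha * x(t) + (1 - alpha) * xhat(t)
        xhat[t + 1] = alpha * x[t] + (1 - alpha) * xhat[t]
    return xhat

def choose_alpha(x, grid=None):
    """Pick alpha on a grid by minimizing the sum of squared one-step errors."""
    x = np.asarray(x, dtype=float)
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)   # assumed search grid
    sse = [np.sum((x[1:] - exp_smooth(x, a)[1:len(x)]) ** 2) for a in grid]
    return float(grid[int(np.argmin(sse))])
```

The last element of `exp_smooth(x, alpha)` is the forecast of the value at time n + 1.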

2.3.  Difference  Equations 

Difference equations are very important in the study of time series analysis. In particular, many forecasting methods can be thought of as future values of difference equations. Further, many probabilistic models for time series are written as difference equations.

Definition. Let z(·) and w(·) be sequences of real numbers. Then

$$z(t) + \alpha_1 z(t-1) + \cdots + \alpha_p z(t-p) = w(t)$$

is called a difference equation of order p, coefficients $\alpha_1, \ldots, \alpha_p$, and forcing term w(·).

To calculate the values of z for all values of t it is sufficient to know all of the values of w and any p consecutive values of z. These p values are called starting values or initial conditions. If we know the starting values z(1), ..., z(p) and the values w(p+1), ..., w(n) for a difference equation, then we can find z(p+1), ..., z(n) recursively by

$$z(p+j) = w(p+j) - \sum_{k=1}^{p} \alpha_k\, z(p+j-k), \qquad j = 1, \ldots, n-p.$$

Note that w(p+1) and z(1), ..., z(p) are used to find z(p+1), which is in turn used in finding z(p+2), and so on.
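The recursion above translates directly into code. The sketch below is illustrative only; the function name and argument layout are hypothetical.

```python
def solve_difference_equation(alpha, start, w):
    """Solve z(t) + alpha_1 z(t-1) + ... + alpha_p z(t-p) = w(t) recursively.

    alpha : [alpha_1, ..., alpha_p]
    start : starting values [z(1), ..., z(p)]
    w     : forcing terms [w(p+1), ..., w(n)]
    Returns [z(1), ..., z(n)].
    """
    p = len(alpha)
    z = list(start)
    for wj in w:
        # z[-1 - k] is the value k + 1 steps back, matching alpha_{k+1}
        z.append(wj - sum(alpha[k] * z[-1 - k] for k in range(p)))
    return z
```

For example, with coefficients (-1, -1), starting values (1, 1), and a zero forcing term, the equation is z(t) = z(t-1) + z(t-2) and the solution is the Fibonacci sequence.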
$$\sum_{j=0}^{p} \alpha_j\, z(t-j) = w(t)$$
$$g(z) = \sum_{j=0}^{p} \alpha_j z^j \quad \text{and} \quad h(z) = \sum_{j=0}^{p} \alpha_j z^{p-j}$$

“  Oil'd  «.  c »•«•**  »d  ^ 

- 

If we plot a complex number z = a + bi on the complex plane, then $|z| = \sqrt{a^2 + b^2}$ is greater than one if and only if the plotted point lies outside the circle of radius one that is centered at the origin. This circle is called the unit circle and we will henceforth refer to zeros being inside, outside, or on the unit circle.
In this lecture we consider how the basic ideas of probability theory apply to the analysis of time series and we discuss some of the traditional time series models.

3.1.  Introduction 

As  in  many  areas  of  statistics,  our  basic  aim  in  time  series  analysis  is  twofold:  descriptive  and  inferential. 
In  Lecture  1  we  considered  some  basic  ways  to  describe  time  series  data.  To  make  inferences  from  data  we 
will  use  the  following  strategy: 

1.  Assume  that  some  member  of  a  family  of  models  will  adequately  represent  the  observed  behavior  of  a 
time  series  data  set. 

2.  Identify  which  member  of  the  family  best  represents  the  data  (model  identification). 

3.  Estimate  the  parameters  of  the  chosen  model. 

4.  Check  the  adequacy  of  the  fit  of  the  estimated  model. 

5.  Make  statistical  and  scientific  inferences  based  on  the  characteristics  of  the  chosen  model. 

Note that we will also make inferences occasionally without assuming a particular model for the data. We will call these nonparametric in analogy with the usual nonparametric analysis in the random sampling case.

The  aim  of  this  lecture  is  to  introduce  both  the  quantities  that  we  want  to  make  inferences  about  (the 
correlogram,  spectral  density  function,  and  predictors)  and  the  models  that  are  usually  used  to  allow  us  to 
make  meaningful  inferences  about  these  quantities. 

We visualize that a data set x(1), ..., x(n) is just one possible set out of many that could have been
generated  by  some  random  mechanism  that  is  producing  data.  The  set  of  all  possible  realizations  that  could 
be  observed  is  called  the  ensemble  of  realizations. 


Definition. A time series is an indexed collection $\{X(t), t \in T\}$ of random variables having finite second moments; that is, $E(X^2(t)) < \infty$ for each element of the index set T.

Usually we will consider T to be the set Z of all integers; that is, we will assume that the phenomenon
being  observed  has  been  going  on  for  a  long  time  and  will  continue  indefinitely.  We  will  often  refer  to  a 
time  series  as  X(t)  or  just  X  if  there  is  no  possibility  of  confusion.  We  use  capital  letters  to  refer  to  random 
variables  and  lowercase  letters  for  particular  values  for  the  random  variables. 
3.2. Covariance Stationary Time Series


*  ^ee,eUely  bribed  by  .  Wfeige  rfit5 


mean

$$m(t) = E(X(t)), \qquad t \in Z,$$

and covariance

$$K(s, t) = \mathrm{Cov}(X(s), X(t)), \qquad s, t \in Z.$$


A great deal of effort is being made in the time series community toward developing methods for analyzing data from processes which do not satisfy this assumption.


tiat  the  joint  distribution  at  X X(tn)  is  Llti^uZmal  ”  "***”  ^ 

Th'  mU"iV“i“'  PUy*  *  —  -  »  *•  thoo,,  Md  rf  ttoe  Mtra 

“  -a.  ^  „  . 

$$K(s, t) = \mathrm{Cov}(X(s), X(t)) = R(t - s);$$

'z£i?  *  -  —  ».  w. *  * 

type^utlnX!1^  S.“taX.  W"d  No‘o  a-  ‘tore  i,  mother  («,on|e,) 

The  Autocorrelation  Function 

Definition. Let X be a covariance stationary time series having autocovariance function R. The autocorrelation function $\rho$ of X is given by

$$\rho(v) = \mathrm{Corr}(X(t), X(t+v)), \qquad v \in Z.$$

The graph of $\rho(v)$ for v = 0, 1, ..., M is called the correlogram of X.

Note that by the definition of the correlation of two random variables,

$$\rho(v) = \frac{\mathrm{Cov}(X(t), X(t+v))}{\sqrt{\mathrm{Var}(X(t))\,\mathrm{Var}(X(t+v))}} = \frac{R(v)}{\sqrt{R(0)R(0)}} = \frac{R(v)}{R(0)},$$

and thus $\rho(0) = 1$ and $\rho(-v) = \rho(v)$. Thus the autocorrelation function of X is just the autocovariance function divided by the variance R(0) of the series.

We will be primarily concerned with covariance stationary time series and thus will be trying to make inferences about $\rho$ and the autocovariance function R. In Theorem 3.2.1 we summarize some of the basic facts about R, and introduce a function that is mathematically equivalent to R and is of importance in its own right, both theoretically and in practice.


Theorem  3.2.1 


The  Spectral  Density  Function 


Let $\{R(v), v \in Z\}$ be the autocovariance function of the covariance stationary time series X. Then

a) $R(v) = R(-v)$, $v \in Z$.

b) If R is absolutely summable, that is, $\sum_{v=-\infty}^{\infty} |R(v)| < \infty$, then there exists a function $f(\omega)$, $\omega \in [0, 1]$, symmetric about $\omega = .5$, called the spectral density function of X, such that

$$R(v) = \int_0^1 f(\omega) \cos 2\pi v\omega \, d\omega, \qquad v \in Z,$$

and

$$f(\omega) = \sum_{v=-\infty}^{\infty} R(v) \cos 2\pi v\omega, \qquad \omega \in [0, 1].$$

c) The equations relating R and f can also be written as

$$R(v) = \int_0^1 f(\omega) e^{2\pi i v\omega} \, d\omega,$$

$$f(\omega) = \sum_{v=-\infty}^{\infty} R(v) e^{-2\pi i v\omega} = R(0) + 2\sum_{v=1}^{\infty} R(v) \cos 2\pi v\omega.$$
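The pair of relations in the theorem can be checked numerically for an autocovariance function that vanishes beyond a finite lag. The sketch below is an illustration under assumptions: the function names are hypothetical, and the integral over [0, 1] is approximated by an average over a uniform grid of frequencies, which is exact for trigonometric polynomials of degree well below the grid size.

```python
import numpy as np

def spectral_density(R, omega):
    """f(w) = R(0) + 2 * sum_{v>=1} R(v) cos(2 pi v w), with R = [R(0), ..., R(m)]."""
    R = np.asarray(R, dtype=float)
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    v = np.arange(1, len(R))
    cos_terms = np.cos(2 * np.pi * v[:, None] * omega[None, :])
    return R[0] + 2 * np.sum(R[1:, None] * cos_terms, axis=0)

def autocovariance_from_density(f_vals, omega, v):
    """Approximate R(v) = integral_0^1 f(w) cos(2 pi v w) dw on a uniform grid."""
    return float(np.mean(f_vals * np.cos(2 * np.pi * v * omega)))

omega = np.arange(512) / 512           # uniform grid on [0, 1)
f = spectral_density([1.25, 0.5], omega)
recovered = [autocovariance_from_density(f, omega, v) for v in range(3)]
```

Here `recovered` reproduces R(0) = 1.25, R(1) = 0.5, and R(2) = 0 up to rounding, and `f` takes the same value at $\omega$ and $1 - \omega$.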


Implications: We are assuming that all inferences about a covariance stationary time series X can be based on making inferences about $\rho$ and R. The theorem provides us with a function f that is mathematically equivalent to R. This function is important in its own right in many circumstances. We can also say that one possible realization $\{x(t), t \in Z\}$ can be thought of as a sum of infinitely many sinusoids where within a particular realization the amplitudes of these sinusoids are fixed, but between possible realizations they vary according to a probability law. These sinusoids are called frequency components and we can think of $f(\omega)$ as being proportional to the average value (over many realizations) of the squared amplitudes of the frequency component of frequency $\omega$.


Ensemble Mean Interpretation of $\rho$ and f

Perhaps the most useful interpretation of $\rho$ and f is as ensemble averages of the sample autocorrelation and sample spectral density functions. It can be shown that under very general conditions,

$$E(\hat{f}(\omega)) \to f(\omega), \qquad E(\hat{\rho}(v)) \to \rho(v)$$

as the length of the sample realization goes to $\infty$. For example, if we are interested in studying the EEG of a patient, we would recognize that a sample time series consisting of an EEG record would vary from one recording to the next, and that we are more interested in making inferences about some average behavior of the observed time series, such as the average of the sample spectral density over many records.

The  White  Noise  Process 

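The simplest time series model arises when X consists of uncorrelated random variables having a mean of 0 and a constant variance.

Definition. A time series X is said to be a white noise process with variance $\sigma^2$ if

$$E(X(t)) = 0, \qquad t \in Z,$$

$$R(v) = \mathrm{Cov}(X(t), X(t+v)) = \begin{cases} \sigma^2, & v = 0 \\ 0, & v \ne 0. \end{cases}$$

Such a process is denoted by $X \sim \mathrm{WN}(\sigma^2)$.

If we define the Kronecker delta function

$$\delta_v = \begin{cases} 1, & v = 0 \\ 0, & v \ne 0, \end{cases}$$

we can write $R(v) = \delta_v \sigma^2$ when $X \sim \mathrm{WN}(\sigma^2)$. Note that

$$f(\omega) = \sum_{v=-\infty}^{\infty} R(v) \cos 2\pi v\omega = \sigma^2 \sum_{v=-\infty}^{\infty} \delta_v \cos 2\pi v\omega = \sigma^2;$$

that is, the spectral density is constant over all frequencies. In analogy with the physical spectrum of white light, such a process is referred to as white noise.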


The Theory of Linear Filters


Definition. A time series Y is said to be a filtered version of a time series X with filter coefficients $\{\beta_j, j \in Z\}$ if we can write (as a limit in mean square)

$$Y(t) = \sum_{j=-\infty}^{\infty} \beta_j X(t-j), \qquad t \in Z.$$


In  Lecture  2  we  defined  the  moving  average  smoother 
which is a filtering operation with coefficients

$$\beta_j = \begin{cases} \dfrac{1}{2M+1}, & |j| \le M \\ 0, & |j| > M. \end{cases}$$

Another  simple  example  of  filtering  a  time  series  is  a  moving  average  process  (as  opposed  to  the  moving 
average  smoother). 


Definition. The time series Y is called a moving average process of order q if

$$Y(t) = \sum_{k=0}^{q} \beta_k\, \epsilon(t-k),$$

where $\beta_0 = 1$ and $\epsilon \sim \mathrm{WN}(\sigma^2)$. We will write $Y \sim \mathrm{MA}(q, \beta, \sigma^2)$ to denote such a series. We will also often write $Y \sim \mathrm{MA}(q)$ to mean that Y is a moving average process of order q without concern for its coefficients $\beta$ or noise variance $\sigma^2$.


We have seen that if $\epsilon \sim \mathrm{WN}(\sigma^2)$, then $R_\epsilon(v) = \delta_v \sigma^2$ and $f_\epsilon(\omega) = \sigma^2$, where we have now put the subscript $\epsilon$ on R and f to indicate to which time series they correspond. The following theorem will allow us to easily find $R_Y$ and $f_Y$ for an MA process. In fact, the theorem provides expressions for $R_Y$ and $f_Y$ (as long as they exist) in terms of $R_X$ and $f_X$ whenever Y is a filtered version of X.


Theorem  3.3.1 


Univariate  Filter  Theorem 


Suppose that X is a covariance stationary time series with autocovariance function $R_X$ and spectral density function $f_X$. Suppose that Y is a filtered version of X with filter coefficients $\beta$. Then assuming that the quantities involved exist, we have

a) Y is also covariance stationary.

b) The autocovariance function of Y is given by

$$R_Y(v) = \sum_{k=-\infty}^{\infty} R_\beta(k) R_X(v-k), \qquad v \in Z,$$

where

$$R_\beta(k) = \sum_{j=-\infty}^{\infty} \beta_j \beta_{j+k}, \qquad k \in Z.$$

c) The spectral density function of Y is given by

$$f_Y(\omega) = |h(e^{2\pi i\omega})|^2 f_X(\omega),$$

where the function

$$h(z) = \sum_{j=-\infty}^{\infty} \beta_j z^j, \qquad z \in C,$$

is called the impulse response function of the filter.

We note that $h(e^{2\pi i\omega})$ is called the frequency transfer function of the filter. We will use the Filter Theorem extensively in the sequel, particularly in the next section where we introduce time series models having a finite number of parameters.

R and f for an MA(q)

Let $Y \sim \mathrm{MA}(q, \beta, \sigma^2)$. Then part (a) of the Filter Theorem allows us to immediately conclude that Y is covariance stationary since it is a filtered version of white noise, which is certainly itself covariance stationary. Further we have

$$f_Y(\omega) = \sigma^2 |h(e^{2\pi i\omega})|^2, \qquad \omega \in [0, 1],$$

where $h(z) = \sum_{k=0}^{q} \beta_k z^k$ is a qth-degree complex valued polynomial. We also have

$$f_Y(\omega) = R_Y(0) + 2\sum_{v=1}^{q} R_Y(v) \cos 2\pi v\omega.$$

The fact that $R_Y(v) = 0$ for $|v| > q$ is an important characterization of an MA(q) process.
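The two expressions for the MA(q) spectral density can be compared numerically. In the sketch below, the autocovariance formula $R_Y(v) = \sigma^2 \sum_{k=0}^{q-v} \beta_k \beta_{k+v}$ is what part (b) of the Filter Theorem gives when the input is white noise; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def ma_autocovariance(beta, sigma2):
    """R_Y(v), v = 0..q, for Y(t) = sum_k beta_k eps(t-k), beta = [1, beta_1, ..., beta_q]."""
    beta = np.asarray(beta, dtype=float)
    q = len(beta) - 1
    return np.array([sigma2 * np.sum(beta[:q + 1 - v] * beta[v:]) for v in range(q + 1)])

def ma_spectral_density(beta, sigma2, omega):
    """f_Y(w) = sigma2 * |h(exp(2 pi i w))|^2 with h(z) = sum_k beta_k z^k."""
    beta = np.asarray(beta, dtype=float)
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    k = np.arange(len(beta))
    h = np.sum(beta[:, None] * np.exp(2j * np.pi * k[:, None] * omega[None, :]), axis=0)
    return sigma2 * np.abs(h) ** 2

beta, sigma2 = [1.0, -0.7, -0.1, 0.6], 2.0     # arbitrary illustrative MA(3)
omega = np.linspace(0.0, 1.0, 201)
R = ma_autocovariance(beta, sigma2)
v = np.arange(1, len(R))
f_from_R = R[0] + 2 * np.sum(R[1:, None] * np.cos(2 * np.pi * v[:, None] * omega[None, :]), axis=0)
f_from_h = ma_spectral_density(beta, sigma2, omega)
```

The arrays `f_from_R` and `f_from_h` agree to rounding error, which is the content of the two displayed formulas.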

The  Effect  of  Differencing 

The Filter Theorem is also important for studying the effects of some transformations. For example, we can see the effect of differencing very clearly. Thus suppose X is a covariance stationary time series with spectral density function $f_X$. Suppose Y is obtained as the dth difference of X; that is, Y(t) = X(t) - X(t-d). Then Y is covariance stationary and

$$f_Y(\omega) = |1 - e^{2\pi i d\omega}|^2 f_X(\omega),$$

since $h(z) = 1 - z^d$. Thus $f_Y(\omega)$ will be zero anywhere that $e^{2\pi i d\omega}$ is one, namely at any $\omega$ such that $d\omega$ is an integer. Thus

$$f_Y(j/d) = 0, \qquad j = 0, 1, \ldots, d.$$

In particular, first differencing makes $f_Y(0) = f_Y(1) = 0$, while 12th differencing makes $f_Y(0) = f_Y(1/12) = \cdots = f_Y(1) = 0$. Thus differencing totally removes frequency components of these frequencies from a time series. In fact, any differencing makes $f_Y(0) = 0$.
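The factor $|1 - e^{2\pi i d\omega}|^2$ by which differencing multiplies the spectral density is easy to tabulate. The helper name below is hypothetical; the sketch simply evaluates that factor.

```python
import numpy as np

def differencing_gain(d, omega):
    """Squared gain |1 - exp(2 pi i d w)|^2 of the d-th difference Y(t) = X(t) - X(t-d)."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    return np.abs(1 - np.exp(2j * np.pi * d * omega)) ** 2

zeros_12 = differencing_gain(12, np.arange(13) / 12)   # frequencies j/12, j = 0..12
peak_1 = differencing_gain(1, 0.5)                     # first difference at w = 1/2
```

The values in `zeros_12` are all zero up to rounding, while `peak_1` equals 4, so first differencing amplifies the highest frequency while removing frequency zero.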

What  Does  the  MA  Smoother  Do? 


Another example of using the Filter Theorem to study the effect of transformations is to consider the moving average smoother. Let X be a covariance stationary time series with spectral density $f_X$ and let

$$Y(t) = \sum_{j=-M}^{M} \frac{1}{2M+1} X(t-j).$$

The frequency response function of this filter is

$$h(e^{2\pi i\omega}) = \frac{1}{2M+1} D_M(\omega),$$

where the function

$$D_M(\omega) = \sum_{j=-M}^{M} e^{2\pi i j\omega}$$

is called the Dirichlet kernel and is well known in many scientific areas. We have

$$D_M(\omega) = \frac{\sin[(M + \tfrac{1}{2})2\pi\omega]}{\sin \pi\omega}.$$

Figure 3.1 gives graphs of $D_M$ for M = 5, 10, 20, and 40. Note that the kernel becomes more concentrated about zero as M increases. Note also that the kernel is negative for certain frequency ranges and has large "sidelobes," that is, secondary peaks.

Figure 3.1. The Dirichlet Kernel.

Thus we have

$$f_Y(\omega) = \frac{1}{(2M+1)^2} |D_M(\omega)|^2 f_X(\omega),$$

and $D_M$ becomes more and more concentrated around frequency zero as M (the number of terms on each side of X(t) used in the average) gets large. Thus the moving average smoother is essentially allowing only frequency components of low frequency to be "passed" to Y from X.
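The closed form of the Dirichlet kernel and the resulting squared gain of the moving average smoother can be checked against the defining sum. The function names below are hypothetical, and the handling of frequencies where $\sin \pi\omega = 0$ (where the sum equals 2M + 1) is an implementation choice of this sketch.

```python
import numpy as np

def dirichlet_sum(M, omega):
    """D_M(w) computed directly as sum_{j=-M}^{M} exp(2 pi i j w); the sum is real."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    j = np.arange(-M, M + 1)
    return np.real(np.sum(np.exp(2j * np.pi * j[:, None] * omega[None, :]), axis=0))

def dirichlet_closed(M, omega):
    """D_M(w) = sin((2M+1) pi w) / sin(pi w), with the limit 2M+1 at integer w."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    s = np.sin(np.pi * omega)
    out = np.full(omega.shape, 2 * M + 1.0)
    ok = np.abs(s) > 1e-12
    out[ok] = np.sin((2 * M + 1) * np.pi * omega[ok]) / s[ok]
    return out

def smoother_gain(M, omega):
    """Squared gain of the smoother: (D_M(w) / (2M+1))^2."""
    return (dirichlet_closed(M, omega) / (2 * M + 1)) ** 2
```

Evaluating `smoother_gain` on a grid for increasing M shows the gain near one at frequency zero and small, but not exactly zero, at higher frequencies.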


Definition. If Y is a filtered version of X with frequency transfer function $h(e^{2\pi i\omega})$, then the filter is called:

a) a low (high) pass filter if only low (high) frequency components are passed through the filter, that is, if $f_Y(\omega) = 0$ for $\omega > \omega_1$ ($\omega < \omega_1$) for some frequency $\omega_1$.

b) a bandpass filter if only frequency components in a certain interval (band) of frequencies are passed through the filter.

Thus the moving average smoother is an example of a low pass filter except that its frequency transfer function never becomes exactly zero for high frequencies. We can now see why $h(e^{2\pi i\omega})$ is called the frequency transfer function of the filter: it determines what happens to the various frequency components in X as a result of the filter.

The  Lag  Operator 

Linear filters can be succinctly represented if we introduce what is called the lag (or backshift) operator.

Definition. The lag operator L operating on a time series X is defined by

$$L^k X(t) = X(t-k), \qquad k \in Z.$$

If Y is a filtered version of X we define the filter polynomial operator to be

$$h(L) = \sum_{k=-\infty}^{\infty} \beta_k L^k.$$

Thus we can write formally

$$Y(t) = \sum_{k=-\infty}^{\infty} \beta_k X(t-k) = \sum_{k=-\infty}^{\infty} \beta_k L^k X(t) = h(L)X(t).$$


3.4.  Time  Series  Prediction 


In this section we describe some of the basic theory for prediction. Several time series models are presented in Section 3.5.

If we have a realization X(1), ..., X(n) from a time series X, we often wish to find a function of the data that is close to some future variable X(n + h). We call n the memory or origin of the predictor and h the horizon or number of steps ahead being predicted.


Definition. Let $X_n = (X(1), \ldots, X(n))^T$ be a realization of length n from a time series X.

a) The best unbiased predictor of X(n + h) given $X_n$ is that function $\tilde{X}_{nh}$ of $X_n$ that has the same mean as X(n + h) and has smaller prediction error variance than any other unbiased function of $X_n$.

b) The best unbiased linear predictor of X(n + h) given $X_n$ is that linear function $\hat{X}_{nh}$ of $X_n$ that has the same mean as X(n + h) and has smaller prediction error variance than any other unbiased linear function of $X_n$.

c) If $\hat{X}_{nh}$ converges as $n \to \infty$ to a random variable $\hat{X}_{\infty h}$, then $\hat{X}_{\infty h}$ is called the infinite memory h step ahead predictor of X(n + h).

d) The error variances of $\tilde{X}_{nh}$, $\hat{X}_{nh}$, and $\hat{X}_{\infty h}$ are denoted $\tilde{\sigma}^2_{nh}$, $\sigma^2_{nh}$, and $\sigma^2_{\infty h}$, respectively.

As the proof of the theorem is a straightforward application of material on prediction of random vectors, we give a discussion after the theorem for those not interested in the details.
Theorem 3.4.1


Univariate Prediction


Let X be a zero mean time series. Then

a) The best unbiased predictor and its error variance are given by

$$\tilde{X}_{nh} = E(X(n+h) \mid X(1), \ldots, X(n))$$
$$\tilde{\sigma}^2_{nh} = \mathrm{Var}(X(n+h) \mid X(1), \ldots, X(n)).$$

b) If X is covariance stationary with autocovariance function R, then the best unbiased linear predictor and its error variance are given by

$$\hat{X}_{nh} = \lambda_{nh}^T P_n X_n$$
$$\sigma^2_{nh} = R(0) - r_{nh}^T \Gamma_n^{-1} r_{nh},$$

where $P_n$ is the n x n matrix having zeros except on its main reverse diagonal which is made up of ones, $X_n = (X(1), \ldots, X(n))^T$, and the prediction coefficients

$$\lambda_{nh} = (\lambda_{nh}(1), \ldots, \lambda_{nh}(n))^T$$

satisfy the prediction normal equations

$$\Gamma_n \lambda_{nh} = r_{nh},$$

where $\Gamma_n = \mathrm{Toepl}(R(0), \ldots, R(n-1))$ and $r_{nh} = (R(h), \ldots, R(h+n-1))^T$. Further, these predictors and prediction error variances can be found using conditional means and variances as in part (a) but for a Gaussian time series having the same autocovariance function as X.

c) Let $\lambda_n(j)$ and $\sigma^2_n$ denote the coefficients and prediction error variances for the best unbiased linear one step ahead predictor. Then the $\lambda_n(j)$ and $\sigma^2_n$ satisfy Levinson's recursion:

$$\lambda_{j+1}(j+1) = \frac{R(j+1) - \sum_{k=1}^{j} \lambda_j(k) R(j+1-k)}{\sigma^2_j}$$

$$\lambda_{j+1}(k) = \lambda_j(k) - \lambda_{j+1}(j+1)\lambda_j(j+1-k), \qquad k = 1, \ldots, j$$

$$\sigma^2_{j+1} = \sigma^2_j \left(1 - \lambda^2_{j+1}(j+1)\right),$$

with $\lambda_1(1) = \rho(1)$ and $\sigma^2_0 = R(0)$. Further, for k > 1, $\lambda_k(k)$ is equal to the correlation between the errors in predicting X(t) from X(t+1), ..., X(t+k-1) and predicting X(t+k) from X(t+1), ..., X(t+k-1). Thus $\lambda_k(k)$ is called the partial autocorrelation coefficient of lag k.

d) If X is covariance stationary and

$$\lim_{n \to \infty} \sigma^2_{n1} = \sigma^2_\infty > 0,$$

then X is said to be purely nondeterministic and

i) There exists a white noise time series $\epsilon$ having variance $\sigma^2_\infty$, and an infinite sequence of constants $\gamma_0 = 1, \gamma_1, \gamma_2, \ldots$ such that, as a limit in mean square,

$$X(t) = \sum_{k=0}^{\infty} \gamma_k\, \epsilon(t-k).$$
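ii) The infinite memory predictors exist and are given by

$$\hat{X}_{\infty h} = \sum_{k=h}^{\infty} \gamma_k\, \epsilon(n+h-k),$$

in mean square, while

$$\sigma^2_{\infty h} = \sigma^2_\infty \sum_{k=0}^{h-1} \gamma_k^2.$$

Further, for $v \ge 0$,

$$R(v) = \sigma^2_\infty \sum_{k=0}^{\infty} \gamma_k \gamma_{k+v}.$$

e) A necessary and sufficient condition for X to be purely nondeterministic is that X have a spectral density f satisfying

$$\int_0^1 \log f(\omega)\, d\omega > -\infty,$$

in which case

$$\sigma^2_\infty = \exp\left\{ \int_0^1 \log f(\omega)\, d\omega \right\}.$$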


Woo,  Decomposition.  Av  cov„'„«  UMx^.  timK^Xcube  ^  „ 

•*(<)= m+v(o. 

-M-a  «*“.  —**»-**.,■*** 

variances,  whiie ptrt  00 ? ?  *"*  v*ri“c«*  «®  juat  conditional  means  and 
the  similarity  of  the  prediction  normal  equLb«  to  *e  n^Z  Ft?”  “d  thrir. erroi  variances.  Note 
playing  the  role  of  yTy,  r  playinc  the  rol*  JyTt  Ul*normal  equations  in  regression  analysis  with  Rid) 
O’)  P~vid«  tn  otio^d  Sit  '%££*."*«*»**  “»  «*<**>.  The 

OTJrsr  “dth~“£ Lred'^cr,y  ■ -  -sl,  ss 

Aow.  that  for  each  memory  n,  th“L »•  P«t  of  partTc) 
*"  “•*  “Jensively  in  identifying  time  series  models.  “  mde*d  *  P“U&1  corrcliti°n.  These  partials 

often  very  easy  kTelfcS.  P^a,^^teWord^iMAinfi,,ite  *“*"  Prcdictor«-  These  are 

innovations  h  often  need  in  other  contexts.  f£m  r w*^*?  °n  °f  *  proce“  “  terms  of  its 
diead  prediction  error  A(n+1)— A- 1  srfn+ll  _u;  v  •P?ft  W  *^at  the  infinite  memory  one  sten 

u  whd  is  left  over  after  having  used  thl  iifinitepasttf  X  aSVjTp •"T**5  **  f(n  +  1) 
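Levinson's recursion from part (c) of Theorem 3.4.1 can be sketched as follows. The function name and return layout are hypothetical, and the sketch assumes the autocovariances R(0), ..., R(n) are known exactly rather than estimated.

```python
import numpy as np

def levinson(R, n):
    """Levinson's recursion for one step ahead prediction.

    R : autocovariances [R(0), ..., R(n)]
    Returns (lam, sig2, pacf):
      lam  = [lambda_n(1), ..., lambda_n(n)]   (memory-n coefficients)
      sig2 = [sigma_0^2, ..., sigma_n^2]       (prediction error variances)
      pacf = [lambda_1(1), ..., lambda_n(n)]   (partial autocorrelations)
    """
    R = np.asarray(R, dtype=float)
    lam = np.zeros(0)
    sig2 = [R[0]]                               # sigma_0^2 = R(0)
    pacf = []
    for j in range(n):
        # lambda_{j+1}(j+1) = (R(j+1) - sum_k lambda_j(k) R(j+1-k)) / sigma_j^2
        last = (R[j + 1] - np.sum(lam * R[j:0:-1])) / sig2[-1]
        new = np.empty(j + 1)
        # lambda_{j+1}(k) = lambda_j(k) - lambda_{j+1}(j+1) lambda_j(j+1-k)
        new[:j] = lam - last * lam[::-1]
        new[j] = last
        lam = new
        sig2.append(sig2[-1] * (1 - last ** 2))
        pacf.append(last)
    return lam, np.array(sig2), np.array(pacf)
```

The coefficients returned for memory n should agree with a direct solution of the prediction normal equations for h = 1, which gives a simple way to check the recursion on a small example.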
3.5. Time Series Models

In  this  section  we  present  some  of  the  models  that  have  been  used  to  represent  data. 

3.5.1.  Random  Walk  Processes 

The  first  process  we  discuss  in  this  section  is  in  fact  not  even  covariance  stationary,  but  occurs  very 
frequently  in  the  physical  and  economic  sciences. 


Definition. Suppose

$$X(t) = X(t-1) + \epsilon(t), \qquad t \ge 1,$$

where $\epsilon \sim \mathrm{WN}(\sigma^2)$. Then X is called a random walk process and we write $X \sim \mathrm{RW}(\sigma^2)$.

Actually a random walk process is not fully specified until the characteristics of the starting value X(0) are given. We usually assume that X(0) is a random variable that is uncorrelated with any of the $\epsilon$'s.

Theorem  3.5.1 


Properties  of  Random  Walks 


Suppose $X \sim \mathrm{RW}(\sigma^2)$ with

$$E(X(0)) = \mu_X, \quad \mathrm{Var}(X(0)) = \sigma^2_X, \quad \mathrm{Var}(\epsilon(t)) = \sigma^2, \quad \mathrm{Cov}(X(0), \epsilon(t)) = 0.$$

Then

$$E(X(t)) = \mu_X, \qquad \mathrm{Var}(X(t)) = \sigma^2_X + t\sigma^2, \qquad t \ge 1.$$


Note that X is not covariance stationary since Var(X(t)) is not independent of t. Further, $\mathrm{Var}(X(t)) \to \infty$ as $t \to \infty$. Figure 3.2 gives five realizations of length 200 from a Gaussian random walk process; that is, X(0) and $\epsilon(1), \ldots, \epsilon(200)$ are iid N(0, 1) variables. As time progresses, the realizations get increasingly far apart. This is expected since Var(X(t)) is increasing linearly without bound as t increases. Note also that these realizations are similar in appearance to many price time series in economics such as stock market data. Finally, note that the first difference of a random walk process is a white noise process.
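The linear growth of Var(X(t)) can be seen in a small simulation. The sketch below is illustrative: the function name, the default parameters, and the use of NumPy's `default_rng` with a fixed seed are all choices made here, not part of the original lecture.

```python
import numpy as np

def simulate_random_walks(n_paths, length, sigma=1.0, sigma_x=1.0, mu_x=0.0, seed=0):
    """Simulate Gaussian random walks X(t) = X(t-1) + eps(t), t = 1..length.

    Returns an array of shape (n_paths, length + 1); column t holds X(t), t = 0..length.
    """
    rng = np.random.default_rng(seed)
    x0 = mu_x + sigma_x * rng.standard_normal((n_paths, 1))    # starting values X(0)
    eps = sigma * rng.standard_normal((n_paths, length))       # white noise
    return np.hstack([x0, x0 + np.cumsum(eps, axis=1)])

paths = simulate_random_walks(5000, 200)
var_200 = paths[:, 200].var()     # theory: sigma_x^2 + 200 * sigma^2 = 201
```

With many simulated paths, the sample variance across paths at time t should be close to $\sigma^2_X + t\sigma^2$, and the first differences of each path recover the white noise.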

Prediction  of  Random  Walks 

If $X \sim \mathrm{RW}(\sigma^2)$, then $\hat{X}_{n1} = X(n)$.

3.5.2.  Moving  Average  Processes 

In  our  discussion  of  linear  filters  above,  we  introduced  the  moving  average  process: 


$$X(t) = \sum_{k=0}^{q} \beta_k\, \epsilon(t-k), \qquad \beta_0 = 1, \quad t \in Z,$$

where $\epsilon \sim \mathrm{WN}(\sigma^2)$, and showed that its autocovariance function R and spectral density function f are given by

$$R(v) = \begin{cases} \sigma^2 \sum_{k=0}^{q-|v|} \beta_k \beta_{k+|v|}, & |v| \le q \\ 0, & |v| > q, \end{cases} \qquad f(\omega) = \sigma^2 |h(e^{2\pi i\omega})|^2, \quad \omega \in [0, 1],$$

where the complex valued polynomial h is given by

$$h(z) = \sum_{k=0}^{q} \beta_k z^k.$$

Figure 3.2. Five Realizations from a Gaussian Random Walk Process.

If we write the process in operator notation as

$$X(t) = h(L)\epsilon(t),$$

we are led to ask whether we can write in any sense

$$\epsilon(t) = \frac{1}{h(L)} X(t).$$


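Theorem 3.5.2


Invertibility of MA Processes


Suppose $X \sim \mathrm{MA}(q, \beta, \sigma^2)$. If the zeros of h are all greater than one in modulus, then

a) $f(\omega) > 0$, $\omega \in [0, 1]$.

b) $\epsilon(t) = \sum_{j=0}^{\infty} \alpha_j X(t-j)$, where $\alpha_0 = 1$ and $\alpha_j = -\sum_{k=1}^{\min(j,q)} \beta_k \alpha_{j-k}$, $j \ge 1$.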


We can now formalize our definition of an invertible MA process.

Definition. An $\mathrm{MA}(q, \beta, \sigma^2)$ process is said to be invertible if all of the zeros of its characteristic polynomial h are greater than one in modulus.
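The invertibility condition can be checked numerically by finding the zeros of h. The sketch below uses `numpy.roots`, which expects coefficients ordered from the highest power down, so the coefficient list is reversed; the function name is hypothetical.

```python
import numpy as np

def ma_is_invertible(beta):
    """Check whether all zeros of h(z) = sum_k beta_k z^k lie outside the unit circle.

    beta : [1, beta_1, ..., beta_q] (coefficients in increasing powers of z)
    Returns (invertible, zeros).
    """
    coeffs = np.asarray(beta, dtype=float)
    zeros = np.roots(coeffs[::-1])          # np.roots wants highest degree first
    return bool(np.all(np.abs(zeros) > 1.0)), zeros
```

For example, `ma_is_invertible([1, 0.5])` finds the single zero z = -2 and reports an invertible process, while `ma_is_invertible([1, 2])` finds z = -0.5 and reports a noninvertible one.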


Prediction  for  MA  Processes 
For a Gaussian MA process,

$$\hat{X}_{nh} = E(X(n+h) \mid X_n) = \sum_{k=0}^{q} \beta_k E(\epsilon(n+h-k) \mid X_n) = 0, \qquad h > q,$$

since $\epsilon(t)$ is independent of X(s) for t > s. Thus by the device described in part (b) of Theorem 3.4.1, we can write the following recursion for the best linear predictors:


Xnh 


lo, 


h>q. 


Partial  Autocorrelations  for  MA  Processes 

In Theorem 3.4.1 we saw that the partial autocorrelation of lag v of a time series X is the last coefficient $\lambda_v(v)$ in the best one step ahead linear predictor of memory v. Note that unlike the ordinary autocorrelations, which become identically zero for lags greater than the order q, the partial autocorrelations of an MA process decay to zero exponentially.

MA  Spectra  and  Trigonometric  Polynomials 

Before leaving MA processes, we note that since R(v) = 0 for |v| > q, we can write from Theorem 3.2.1,

$$f(\omega) = R(0) + 2\sum_{v=1}^{q} R(v) \cos 2\pi v\omega.$$

Now we can also write $\cos 2\pi v\omega$ as a polynomial of degree v in $\cos 2\pi\omega$ since for $v \ge 2$, we have the important trigonometric identity

$$\cos v\theta = 2\cos\theta \cos(v-1)\theta - \cos(v-2)\theta,$$

which if used recursively, ultimately expresses $\cos v\theta$ as a polynomial in $\cos\theta$. Thus the spectral density of an MA(q) can be written as a qth-degree trigonometric polynomial. The above identity is also important in other contexts in time series analysis. If we write $z(v) = \cos v\theta$, we have

$$z(v) - 2\cos\theta\, z(v-1) + z(v-2) = 0, \qquad v \ge 2,$$

with z(0) = 1 and $z(1) = \cos\theta$, which is a second-order difference equation with initial conditions z(0) and z(1). This equation has solution $z(v) = \cos v\theta$.
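The second-order difference equation for $z(v) = \cos v\theta$ can be run directly as a recursion and compared with the cosine itself. The function name is hypothetical; only the standard library is used.

```python
import math

def cos_multiples(theta, vmax):
    """Return [cos(0*theta), ..., cos(vmax*theta)] via z(v) = 2 cos(theta) z(v-1) - z(v-2)."""
    z = [1.0, math.cos(theta)]              # initial conditions z(0), z(1)
    for v in range(2, vmax + 1):
        z.append(2 * math.cos(theta) * z[v - 1] - z[v - 2])
    return z[:vmax + 1]
```

Comparing `cos_multiples(0.7, 10)` with `math.cos(v * 0.7)` for v = 0, ..., 10 shows agreement to rounding error.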

Examples  of  MA  Processes 

In  this  section  we  have  seen: 

1. The autocorrelation function for an MA(q) process is identically zero for lags greater than the order q.

2. The partial autocorrelation function decays to zero as v increases.

3. The spectral density function of an MA(q) process is a qth-degree trigonometric polynomial. Thus for small values of q, it is difficult for the spectral density to have sharp peaks.
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length  200  from  eaeh  of  ,rKln>  deDsit>’'  *”<i  on'  eedizetion  of 

Model 1: $X(t) = \epsilon(t) - 0.70\epsilon(t-1) - 0.10\epsilon(t-2) + 0.60\epsilon(t-3)$

Model 2: $X(t) = \epsilon(t) + 0.80\epsilon(t-4)$;

^io^^'st^  ^A!Lpsrr,d«“rf  t?'» r°d  “,*,(*ulb“‘)  ma<4>-  b  '*>*>> 

fo.  the  subset  model  me  «,„  except  f„,  Ug.  th.t  nSipta  *'/““*  »  “*•  »• 

3.5.3.  Autoregressive  Processes 

Intuitively, X is an autoregressive process of order p with coefficients $\alpha_1, \ldots, \alpha_p$ and noise variance $\sigma^2$ if it can be transformed to white noise by a filter of length p; that is,

$$X(t) + \alpha_1 X(t-1) + \cdots + \alpha_p X(t-p) = \epsilon(t), \qquad t \in Z,$$

where $\epsilon \sim \mathrm{WN}(\sigma^2)$. One appeal of this process is that it is in the form of a regression model

$$X(t) = -\alpha_1 X(t-1) - \cdots - \alpha_p X(t-p) + \epsilon(t),$$

with past (or "lagged") values of X as the independent variables in the regression.

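It is tempting to define an autoregressive process directly by this equation, treating its terms as random variables. The difficulty with this is in determining whether there exists a time series that satisfies such an equation. To illustrate, consider the first-order case

$$X(t) + \alpha X(t-1) = \epsilon(t).$$

Successively substituting for X(t-1), X(t-2), ... gives

$$X(t) - \sum_{j=0}^{k-1} (-\alpha)^j \epsilon(t-j) = (-\alpha)^k X(t-k).$$

If $|\alpha| < 1$, then taking the limit of the expected value of the square of both sides of this result, we have

$$X(t) = \sum_{j=0}^{\infty} \beta_j\, \epsilon(t-j),$$

where $\beta_j = (-\alpha)^j$.

Now we could also write $\frac{1}{\alpha} X(t+1) + X(t) = \frac{1}{\alpha}\epsilon(t+1)$, which gives upon successive substitution

$$X(t) = -\frac{1}{\alpha}\left(X(t+1) - \epsilon(t+1)\right)$$

$$= -\frac{1}{\alpha}\left(-\frac{1}{\alpha}\left(X(t+2) - \epsilon(t+2)\right) - \epsilon(t+1)\right)$$

$$= \left(-\frac{1}{\alpha}\right)^k X(t+k) - \sum_{j=1}^{k} \left(-\frac{1}{\alpha}\right)^j \epsilon(t+j),$$

which, if $|\alpha| > 1$, gives as a limit in mean square

$$X(t) = \sum_{j=1}^{\infty} \beta_j\, \epsilon(t+j),$$

where $\beta_j = -(-1/\alpha)^j$. If $|\alpha| = 1$ we are not able to do this process as neither $(-\alpha)^k$ nor $(-1/\alpha)^k$ goes to zero.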
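For $|\alpha| < 1$, the moving average representation of the first-order equation can be checked numerically by truncating the infinite sum. This is a sketch under assumptions: the function name and the truncation length `n_terms` are choices made here, and the truncated sum satisfies the difference equation only up to an error of order $|\alpha|^{n\_terms}$.

```python
import numpy as np

def ar1_from_ma_representation(alpha, eps, n_terms):
    """Approximate X(t) = sum_{j>=0} (-alpha)^j eps(t-j) using the first n_terms terms.

    Returns X at the positions n_terms-1, ..., len(eps)-1 of eps, where every
    term of the truncated sum is available.
    """
    beta = (-alpha) ** np.arange(n_terms)           # beta_j = (-alpha)^j
    full = np.convolve(eps, beta)                   # full[m] = sum_j beta[j] eps[m-j]
    return full[n_terms - 1:len(eps)]

rng = np.random.default_rng(1)
eps = rng.standard_normal(500)
alpha, n_terms = 0.5, 60
x = ar1_from_ma_representation(alpha, eps, n_terms)
residual = x[1:] + alpha * x[:-1]                   # should reproduce eps(t)
```

The array `residual` matches the corresponding white noise values `eps[n_terms:]` to within the truncation error, confirming that the one-sided moving average solves X(t) + αX(t-1) = ε(t).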


These arguments can be extended to p > 1 where instead of the location of $|\alpha|$ determining whether to write X(t) as a function of past and present or future $\epsilon$'s, the location of the zeros of the complex valued polynomial

$$g(z) = \sum_{j=0}^{p} \alpha_j z^j, \qquad \alpha_0 = 1,$$

determines the representation. The following theorem summarizes the basic results.


Theorem  3.5.3 


Properties  of  Autoregressive  Processes 


Let $\alpha_0 = 1$ and $\alpha_1, \ldots, \alpha_p$ be a set of real constants, and define the complex valued polynomial (called the characteristic polynomial of the stochastic difference equation) $g(z) = \sum_{j=0}^{p} \alpha_j z^j$. Let $\epsilon \sim \mathrm{WN}(\sigma^2)$.

a) If none of the zeros of g are equal to one in modulus, then

$$X(t) = \sum_{j=-\infty}^{\infty} \beta_j\, \epsilon(t-j), \qquad t \in Z,$$

exists as a limit in mean square where the $\beta$'s are the coefficients of the polynomial

$$h(z) = \frac{1}{g(z)} = \sum_{j=-\infty}^{\infty} \beta_j z^j.$$

This representation is called the (doubly) infinite order moving average representation of X. Further, X satisfies the autoregressive difference equation, is covariance stationary, has spectral density

$$f(\omega) = \sigma^2 \frac{1}{|g(e^{2\pi i\omega})|^2} = \sigma^2 |h(e^{2\pi i\omega})|^2,$$

and has autocovariance function R satisfying

$$\sum_{j=0}^{p} \alpha_j R(j - v) = \sigma^2 \beta_{-v}, \qquad v \in Z.$$

b) If the zeros of g are all less than one in modulus, then $\beta_j = 0$ for $j \ge 0$; that is, X(t) can be expressed as a function of $\epsilon$'s at only future times.

c) If the zeros of g are all greater than one in modulus, then $\beta_j = 0$ for $j < 0$; that is, X(t) can be expressed as a function of $\epsilon$'s at only the present and past times. Thus the equations relating the R and the $\alpha$'s can be written as

$$\sum_{j=0}^{p} \alpha_j R(j - v) = \delta_v \sigma^2, \qquad v \ge 0,$$

which are called the Yule-Walker equations.

From  this  theorem  we  can  see  that  as  long  as  none  of  the  zeros  of  g  are  on  the  unit  circle  (that  is,  equal 
to  one  in  modulus),  the  infinite  order  moving  average  process  provides  a  covariance  stationary  solution  to 
the  stochastic  difference  equation  and  has  the  properties  given  by  the  theorem.  If  all  of  the  zeros  of  g  are 
outside  the  unit  circle,  then  a  stationary  solution  can  be  explicitly  defined. 

Since in practice we do not observe the future, we would like to define an autoregressive process so that X(t) is only a function of $\{\epsilon(s),\ s \le t\}$. We formalize this in the following definition.


Definition. A time series X is said to be an autoregressive process of order p with coefficients $\alpha = (\alpha_1, \ldots, \alpha_p)^T$ and noise variance $\sigma^2$ if X satisfies the stochastic difference equation

$$X(t) + \alpha_1 X(t-1) + \cdots + \alpha_p X(t-p) = \epsilon(t),$$

where the zeros of $g(z) = \sum_{j=0}^{p} \alpha_j z^j$ are all greater than one in modulus. We denote such a time series by $X \sim AR(p, \alpha, \sigma^2)$.

Recall for an MA(q) process that $R(v) = 0$ for $|v| > q$; that is, as soon as the lag v gets larger than the order of the process, X(t) and X(t+v) are no longer correlated. The autoregressive process has a correlation function that often gives a more realistic description of how X(t) and X(t+v) become uncorrelated as v increases. We will illustrate this by considering the AR(1) process. For an AR(1) process having coefficient $\alpha$ and error variance $\sigma^2$, we can write the Yule-Walker equations for $v = 0$ and $v = 1$ as

$$R(0) + \alpha R(1) = \sigma^2$$

$$R(1) + \alpha R(0) = 0,$$

since $R(1) = R(-1)$. From the second equation we have $R(1) = -\alpha R(0)$, which when substituted into the first equation gives

$$R(0) = \frac{\sigma^2}{1 - \alpha^2}.$$

From the Yule-Walker equations for $v > 0$, we have

$$R(v) = -\alpha R(v-1) = (-\alpha)^2 R(v-2) = \cdots = (-\alpha)^v R(0),$$

which gives

$$R(v) = \frac{\sigma^2 (-\alpha)^v}{1 - \alpha^2}, \qquad v \ge 0.$$

Finally, since $R(v) = R(-v)$, we have

$$R(v) = \frac{\sigma^2 (-\alpha)^{|v|}}{1 - \alpha^2}, \qquad \rho(v) = (-\alpha)^{|v|}, \qquad v \in Z.$$

Note that a sequence $\{a(v),\ v \ge 0\}$ is said to decay exponentially to zero if we can write

$$|a(v)| \le c\,|\theta|^v,$$

where $|\theta| < 1$. Thus the autocovariance function (and hence the autocorrelation function) of an AR(1) process decays exponentially to zero. The rate of decay depends on how close $|\alpha|$ is to one.
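The closed form for the AR(1) autocovariance can be checked numerically against the one-sided moving average representation $X(t) = \sum_{j \ge 0} (-\alpha)^j \epsilon(t-j)$. The following is a minimal sketch; the function names and the values $\alpha = 0.6$, $\sigma^2 = 2$ are illustrative assumptions, not part of the lecture.

```python
# AR(1): X(t) + alpha*X(t-1) = eps(t), |alpha| < 1, eps ~ WN(sigma2).
# Function names and parameter values are illustrative assumptions.

def ar1_autocov_closed_form(alpha, sigma2, v):
    """R(v) = sigma2 * (-alpha)**|v| / (1 - alpha**2)."""
    return sigma2 * (-alpha) ** abs(v) / (1.0 - alpha ** 2)


def ar1_autocov_from_ma(alpha, sigma2, v, n_terms=2000):
    """R(v) = sigma2 * sum_j beta_j * beta_{j+|v|} with beta_j = (-alpha)**j,
    truncating the infinite sum after n_terms terms."""
    v = abs(v)
    beta = [(-alpha) ** j for j in range(n_terms + v)]
    return sigma2 * sum(beta[j] * beta[j + v] for j in range(n_terms))


alpha, sigma2 = 0.6, 2.0
for v in range(4):
    closed = ar1_autocov_closed_form(alpha, sigma2, v)
    series = ar1_autocov_from_ma(alpha, sigma2, v)
    print(v, round(closed, 6), round(series, 6))
```

The two columns agree to rounding error, and the first two values also satisfy the Yule-Walker equation $R(0) + \alpha R(1) = \sigma^2$.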
Theorem 3.5.4


Exponential Decay of AR Correlations


If $X \sim AR(p, \alpha, \sigma^2)$ and $\rho$ is the autocorrelation function of X, then

$$|\rho(v)| \le c \left( \max_{1 \le j \le p} \frac{1}{|z_j|} \right)^{|v|}, \qquad v \in Z,$$

for some constant $c > 0$, where $z_1, \ldots, z_p$ are the zeros of g.

Prediction for AR Processes

Prediction is very simple for AR processes, as we can write for a Gaussian AR(p) process

$$X_{n,h} = E(X(n+h) \mid X_n) = -\sum_{j=1}^{p} \alpha_j X_{n,h-j},$$

where $X_{n,h-j} = X(n+h-j)$ whenever $h - j \le 0$. Thus the AR predictors satisfy the homogeneous difference equation

$$\sum_{j=0}^{p} \alpha_j X_{n,h-j} = 0.$$

Further,

$$\mathrm{Var}(X_{n,h} - X(n+h)) = \sigma^2 \sum_{k=0}^{h-1} \beta_k^2,$$

where the $\beta$'s are the coefficients of the MA($\infty$) representation of X.
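The predictor recursion and the prediction error variance can be sketched as follows. This is an illustrative sketch under the sign convention $X(t) + \sum_j \alpha_j X(t-j) = \epsilon(t)$ used in this lecture; the function names and the sample values are assumptions, and the data are taken to have mean zero.

```python
def ar_forecast(x, alpha, horizon):
    """Predictors X_{n,1..horizon} for a mean-zero AR(p).
    x: observed values X(1..n) with n >= p; alpha: [alpha_1, ..., alpha_p]."""
    p = len(alpha)
    hist = list(x[-p:])  # last p observations, oldest first
    preds = []
    for _ in range(horizon):
        # X_{n,h} = -sum_{j=1}^{p} alpha_j * X_{n,h-j}
        nxt = -sum(alpha[j] * hist[-1 - j] for j in range(p))
        preds.append(nxt)
        hist.append(nxt)  # later steps use earlier predictors
    return preds


def ma_inf_coeffs(alpha, count):
    """beta_0..beta_{count-1} of 1/g(z): beta_0 = 1 and
    beta_k = -sum_{j=1}^{min(k,p)} alpha_j * beta_{k-j}."""
    beta = [1.0]
    for k in range(1, count):
        beta.append(-sum(alpha[j - 1] * beta[k - j]
                         for j in range(1, min(k, len(alpha)) + 1)))
    return beta


def forecast_error_variance(alpha, sigma2, h):
    """sigma2 * sum_{k=0}^{h-1} beta_k**2."""
    return sigma2 * sum(b * b for b in ma_inf_coeffs(alpha, h))


alpha = [-0.80, 0.40]   # X(t) - 0.80 X(t-1) + 0.40 X(t-2) = eps(t)
x = [0.5, 1.0, 2.0]     # made-up observations
print(ar_forecast(x, alpha, 3))
print([forecast_error_variance(alpha, 1.0, h) for h in (1, 2, 3)])
```

Because the zeros of g are outside the unit circle, the predictors shrink toward zero as the horizon grows, while the error variance increases toward R(0).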

Examples of AR Processes

We have now seen that for an AR(p) process:

1. The autocorrelation function satisfies a difference equation (namely the Yule-Walker equations) and, since the zeros of g are all outside of the unit circle, it decays exponentially to zero.

2. The partial autocorrelation function is identically zero for lags greater than p.

3. The spectral density is the reciprocal of a pth-degree trigonometric polynomial. This means that it can have sharp peaks.

The correlogram, partial correlogram, spectral density, and a realization are given for each of the following two AR models:

Model 1: $X(t) - 0.80X(t-1) + 0.40X(t-2) = \epsilon(t)$

Model 2: $X(t) - 0.90X(t-1) + 0.70X(t-2) = \epsilon(t)$.
Note how the qualitative features of the data correspond to the spectra and correlations. In particular,
notice  that  the  correlograms  are  of  two  types.  In  Series  1,  the  correlogram  decays  relatively  rapidly  to  0, 
while  in  Series  2,  the  correlogram  follows  a  sinusoidal  decay. 
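The difference between the two correlograms can be traced to the zeros of the characteristic polynomials. The sketch below (function names are assumptions) computes the zeros of $g(z) = 1 + \alpha_1 z + \alpha_2 z^2$ for each model and the autocorrelations from the Yule-Walker recursion $\rho(v) = -\alpha_1\rho(v-1) - \alpha_2\rho(v-2)$ with $\rho(1) = -\alpha_1/(1+\alpha_2)$.

```python
import cmath


def ar2_zeros(a1, a2):
    """Zeros of g(z) = 1 + a1*z + a2*z**2 (quadratic formula)."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return ((-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2))


def ar2_autocorr(a1, a2, max_lag):
    """rho(0..max_lag) of an AR(2) from the Yule-Walker equations."""
    rho = [1.0, -a1 / (1.0 + a2)]
    for v in range(2, max_lag + 1):
        rho.append(-a1 * rho[v - 1] - a2 * rho[v - 2])
    return rho


models = {"Model 1": (-0.80, 0.40), "Model 2": (-0.90, 0.70)}
for name, (a1, a2) in models.items():
    z1, z2 = ar2_zeros(a1, a2)
    rho = ar2_autocorr(a1, a2, 10)
    print(name, "zero moduli:", round(abs(z1), 3), round(abs(z2), 3))
    print("  rho(1..6):", [round(r, 3) for r in rho[1:7]])
```

With these coefficients both models have a complex conjugate pair of zeros. The zeros of Model 2 lie closer to the unit circle, so its oscillating autocorrelations die out more slowly than those of Model 1.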

3.5.4.  Autoregressive-Moving  Average  Processes 

In  the  previous  two  sections  we  studied  MA  and  AR  processes.  In  this  section  we  consider  the 
autoregressive-moving  average  (ARMA)  process  which  in  a  variety  of  ways  is  a  combination  of  the  AR 
and  MA  processes. 


Definition. A time series X is said to be an autoregressive-moving average process of orders p and q, coefficients $\alpha = (\alpha_1, \ldots, \alpha_p)^T$ and $\beta = (\beta_1, \ldots, \beta_q)^T$, and noise variance $\sigma^2$ if X satisfies

$$\sum_{j=0}^{p} \alpha_j X(t-j) = \sum_{k=0}^{q} \beta_k\,\epsilon(t-k), \qquad t \in Z,$$

where $\alpha_0 = \beta_0 = 1$, $\epsilon \sim WN(\sigma^2)$, and the zeros of the complex polynomial $g(z) = \sum_{j=0}^{p} \alpha_j z^j$ are all outside the unit circle.

We denote such a time series by

$$X \sim ARMA(p, q, \alpha, \beta, \sigma^2).$$

If the zeros of the complex polynomial $h(z) = \sum_{k=0}^{q} \beta_k z^k$ are outside the unit circle, then we say that X is invertible.


The operators g(L) and h(L) are called the AR and MA operators, respectively. Using arguments similar to those given in the previous two sections, we can show that an ARMA process has the MA($\infty$) representation

$$X(t) = \sum_{k=0}^{\infty} \gamma_k\,\epsilon(t-k),$$

where the $\gamma_k$'s are the coefficients of the polynomial $h(z)/g(z)$, while if X is invertible, it has the AR($\infty$) representation

$$\sum_{j=0}^{\infty} a_j X(t-j) = \epsilon(t), \qquad t \in Z,$$

where the $a_j$'s are the coefficients of the polynomial $g(z)/h(z)$. Thus we are able to justify the idea of writing $g(L)X(t) = h(L)\epsilon(t)$ as

$$X(t) = \frac{h(L)}{g(L)}\,\epsilon(t)$$

or

$$\frac{g(L)}{h(L)}\,X(t) = \epsilon(t).$$

In the next theorem, we express the autocovariance and spectral density functions of an ARMA process in terms of its parameters.


Theorem 3.5.6

R and f for an ARMA Process

If $X \sim ARMA(p, q, \alpha, \beta, \sigma^2)$, then

a) $$f(\omega) = \sigma^2\,\frac{|h(e^{2\pi i\omega})|^2}{|g(e^{2\pi i\omega})|^2}.$$

b) The autocovariances R satisfy

$$\sum_{j=0}^{p} \alpha_j R(j - v) = \sum_{k=0}^{q} \beta_k R_{X\epsilon}(v - k), \qquad v \ge 0,$$

$$= 0, \qquad v > q,$$

where the cross-covariances $R_{X\epsilon}$ are given by

$$R_{X\epsilon}(v) = \mathrm{Cov}(X(t), \epsilon(t+v)) = \begin{cases} 0, & v > 0 \\ \sigma^2 \gamma_{-v}, & v \le 0 \end{cases}$$

and the $\gamma$'s are the coefficients of the MA($\infty$) representation of X.

Thus R satisfies a homogeneous difference equation of order p for $v > q$. The equations for $v = q+1, \ldots, q+p$ are often called the high order Yule-Walker equations. We can now summarize the nature of R (or $\rho$) and f for AR, MA, and ARMA processes as in Table 3.1.

Table 3.1. Nature of R and f for ARMA Processes

Process       R(v) or rho(v)                  f(omega)
MA(q)         0 for |v| > q                   qth-degree trigonometric polynomial
AR(p)         exponential decay for v > 0     reciprocal of pth-degree trigonometric polynomial
ARMA(p,q)     exponential decay for v > q     (q,p)th-degree rational trigonometric polynomial
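The MA($\infty$) coefficients and the high order Yule-Walker equations can be illustrated numerically. The sketch below obtains the $\gamma_k$ by matching coefficients in $g(z)\gamma(z) = h(z)$, approximates $R(v) = \sigma^2\sum_k \gamma_k\gamma_{k+v}$ by a truncated sum, and checks that $\sum_j \alpha_j R(j-v)$ vanishes for $v > q$. The function names, the truncation length, and the choice of example (an ARMA(1,4) whose first three MA coefficients are zero) are assumptions made for illustration.

```python
def arma_ma_inf(alpha, beta, count):
    """gamma_0..gamma_{count-1} of h(z)/g(z).
    alpha = [alpha_1..alpha_p], beta = [beta_1..beta_q]; alpha_0 = beta_0 = 1."""
    a = [1.0] + list(alpha)
    b = [1.0] + list(beta)
    gamma = []
    for k in range(count):
        bk = b[k] if k < len(b) else 0.0
        # gamma_k = beta_k - sum_{j=1}^{min(k,p)} alpha_j * gamma_{k-j}
        acc = sum(a[j] * gamma[k - j] for j in range(1, min(k, len(a) - 1) + 1))
        gamma.append(bk - acc)
    return gamma


def arma_autocov(alpha, beta, sigma2, max_lag, n_terms=600):
    """R(0..max_lag) from a truncated MA(infinity) representation."""
    gamma = arma_ma_inf(alpha, beta, n_terms + max_lag)
    return [sigma2 * sum(gamma[k] * gamma[k + v] for k in range(n_terms))
            for v in range(max_lag + 1)]


alpha = [-0.90]                  # X(t) - 0.90 X(t-1) = ...
beta = [0.0, 0.0, 0.0, 0.80]     # ... eps(t) + 0.80 eps(t-4)
R = arma_autocov(alpha, beta, 1.0, 8)
for v in range(5, 9):            # lags v > q = 4
    print(v, R[v] + alpha[0] * R[v - 1])   # sum_j alpha_j R(j - v)
```

The printed values are zero up to rounding, which is the homogeneous difference equation of order p for lags beyond q.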

Examples  of  ARMA  Processes 

As  we  have  seen,  the  ARMA  process  is  a  combination  of  the  AR  and  MA  processes.  The  basic  charac- 
teristics  of  an  ARMA  model  are: 

1.  Neither  the  correlogram  nor  the  partial  correlogram  is  identically  zero  past  some  lag;  rather  they  each 
decay  exponentially  past  some  lag. 

2. The spectral density function is the ratio of a qth-degree trigonometric polynomial to a pth-degree trigonometric polynomial. Thus it can have sharp peaks and/or sharp troughs.

In  Figure  3.5  we  give  the  correlogram,  partial  correlogram,  spectral  density,  and  a  realization  of  length 
200  from  each  of  the  two  ARMA  models: 

Model 1: $X(t) - 1.20X(t-1) + 0.78X(t-2) = \epsilon(t) + 0.45\epsilon(t-1) + 0.39\epsilon(t-2)$

Model 2: $X(t) - 0.90X(t-1) = \epsilon(t) + 0.80\epsilon(t-4)$.

That is, Model 1 is an ARMA(2,2), while Model 2 is an ARMA(1,4) where the first three MA coefficients are zero. In each of these models, both the correlogram and the partial correlogram exhibit an exponential decay, thus ruling out a pure AR or pure MA model.

3.5.5.  Subset  and  Multiplicative  Subset  ARMA  Processes 

A natural extension of the ARMA model is the case where only a few of its coefficients are nonzero; that is, the subset ARMA models which we write as

$$X(t) + \sum_{j=1}^{P} \alpha_j X(t - u_j) = \epsilon(t) + \sum_{k=1}^{Q} \beta_k\,\epsilon(t - v_k),$$

where $u_1 < \cdots < u_P$ and $v_1 < \cdots < v_Q$ are called the AR and MA lags, respectively. We can go a step further and define a multiplicative subset ARMA process as a combination of a "full" ARMA and subset ARMA by

$$g(L)G(L)X(t) = h(L)H(L)\epsilon(t),$$

where g and h are the AR and MA operators of the full ARMA model and G and H are those of the subset ARMA model.

Such models will be important later. Note that as long as the zeros of each of the four polynomials involved in the model are all outside the unit circle, then we can express the multiplicative subset ARMA model as an ordinary ARMA model of orders $p + u_P$ and $q + v_Q$, and use all of the results that we have obtained for ARMA processes.

3.5.6.  ARIMA  Processes 

Many of the processes in economics and business are nonstationary but can be transformed to stationarity by differencing. This fact has led to the popularity of the so-called Box-Jenkins method of time series analysis. The method employs extensively what is called an ARIMA model.


Definition. A time series X is called an autoregressive integrated moving average process of orders (p, d, q) if the series $Z(t) = (1 - L)^d X(t)$ is an ARMA(p,q) process. We denote such a process by

$$X \sim ARIMA(p, d, q, \alpha, \beta, \sigma^2).$$


The simplest example of such a model is the random walk process where $d = 1$ and $p = q = 0$. Note that if we let g and h denote the AR and MA operators for the ARMA part of the model, then we have that X satisfies

$$g(L)(1 - L)^d X(t) = h(L)\epsilon(t).$$
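The differencing operator $(1 - L)^d$ is simple to apply in practice. The following sketch (the function name and the simulated data are assumptions) differences a simulated random walk once and recovers the white noise that generated it, which is the case $d = 1$, $p = q = 0$ mentioned above.

```python
import random


def difference(x, d=1):
    """Apply (1 - L)**d to the list x; each pass shortens it by one."""
    for _ in range(d):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


random.seed(0)
eps = [random.gauss(0.0, 1.0) for _ in range(200)]

# Random walk: X(t) = X(t-1) + eps(t).
walk, total = [], 0.0
for e in eps:
    total += e
    walk.append(total)

z = difference(walk, d=1)           # Z(t) = (1 - L) X(t)
print(max(abs(z[i] - eps[i + 1]) for i in range(len(z))))  # essentially zero
print(difference([1, 4, 9, 16], d=2))                      # second differences
```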


Multiplicative Subset Seasonal ARIMA Models

A general model of the form

$$g(L)G(L)(1 - L)^d (1 - L^S)^D X(t) = h(L)H(L)\epsilon(t),$$

where g, G, h, and H are the four operators in the multiplicative subset ARMA model, provides a general framework within which many seasonal time series can be analyzed. If the AR and MA lags are multiples of the seasonality factor S, then this model is the one which is used in the Box-Jenkins forecasting method.
In Section 4.1 we describe the sampling properties of the descriptive statistics that were introduced in Lecture 1. In Section 4.2 we describe the basis of the two tests for white noise we used in Lecture 1. In Sections 4.3 and 4.4 we consider the problems of estimating the parameters of an ARMA model and identifying the orders of the model.


4.1.  Sampling  Properties  of  Descriptive  Statistics 

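In Lecture 1 we introduced the statistics

$$\bar X = \frac{1}{n} \sum_{t=1}^{n} X(t),$$

$$\hat R(v) = \frac{1}{n} \sum_{t=1}^{n-|v|} (X(t) - \bar X)(X(t+|v|) - \bar X), \qquad |v| < n,$$

$$\hat\rho(v) = \frac{\hat R(v)}{\hat R(0)}, \qquad |v| < n,$$

$$\hat f(\omega) = \frac{1}{n} \left| \sum_{t=1}^{n} (X(t) - \bar X)\,e^{2\pi i (t-1)\omega} \right|^2, \qquad \omega \in [0, 1].$$

We use upper case letters here to emphasize that we are now treating these quantities as random variables, whereas in Lecture 1 we were treating them primarily as sets of numbers. In this section we describe the sampling properties (mean, variance, and asymptotic distribution) of $\bar X$, $\hat R$, $\hat\rho$, and $\hat f$.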
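As a concrete illustration, the four statistics can be computed directly from their definitions. This is a minimal sketch with assumed function names and made-up data; it also checks the identity $\hat f(\omega) = \sum_{|v|<n} \hat R(v)\cos 2\pi v\omega$ that links the periodogram to the sample autocovariances.

```python
import cmath
import math


def sample_mean(x):
    return sum(x) / len(x)


def sample_autocov(x, v):
    """R-hat(v) with divisor n (not n - |v|)."""
    n, xbar, v = len(x), sample_mean(x), abs(v)
    return sum((x[t] - xbar) * (x[t + v] - xbar) for t in range(n - v)) / n


def sample_autocorr(x, v):
    return sample_autocov(x, v) / sample_autocov(x, 0)


def periodogram(x, omega):
    """f-hat(omega) = |sum_t (X(t) - Xbar) e^{2 pi i (t-1) omega}|^2 / n.
    The 0-based index t below plays the role of t - 1."""
    n, xbar = len(x), sample_mean(x)
    s = sum((x[t] - xbar) * cmath.exp(2j * math.pi * t * omega) for t in range(n))
    return abs(s) ** 2 / n


x = [2.0, 4.0, 3.0, 7.0, 1.0, 5.0]   # made-up data
omega = 0.2
lhs = periodogram(x, omega)
rhs = sample_autocov(x, 0) + 2.0 * sum(
    sample_autocov(x, v) * math.cos(2.0 * math.pi * v * omega)
    for v in range(1, len(x)))
print(lhs, rhs)
```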

>-•'  **  4-  - 
4.1.1. The Sample Mean


Theorem 4.1.1 Sampling Properties of $\bar X$

Let X be a covariance stationary time series with mean $\mu$ and autocovariance function R. Let $X_n = (X(1), \ldots, X(n))^T$ be a realization of length n from X and let $\bar X_n = \frac{1}{n}\sum_{t=1}^{n} X(t)$. Then

a) $E(\bar X_n) = \mu$.

b) $\mathrm{Var}(\bar X_n) \to 0$ as $n \to \infty$ if and only if $\mathrm{Cov}(X(n), \bar X_n) \to 0$ as $n \to \infty$, a sufficient condition for which is that $R(n) \to 0$ as $n \to \infty$.

c) If $\sum_{v=-\infty}^{\infty} |R(v)| < \infty$, then $\lim_{n\to\infty} n\,\mathrm{Var}(\bar X_n) = \sum_{v=-\infty}^{\infty} R(v) = f(0)$.

d) Under mild assumptions, $\sqrt{n}\,(\bar X_n - \mu) \xrightarrow{L} N(0, f(0))$.

Implications: Part (a) says that $\bar X_n$ is an unbiased estimator of $\mu$. A sufficient condition for an estimator to be consistent is that it is asymptotically unbiased (that is, has expectation converging to the parameter being estimated) and has variance converging to zero. Thus part (b) gives a condition for $\bar X_n$ to be consistent. A time series is also said to be ergodic in the mean if $\mathrm{Var}(\bar X_n) \to 0$ as $n \to \infty$, that is, if the average $\bar X_n$ over time of a single realization approaches the average $\mu = E(X(t))$ at a single time t of the ensemble of all possible realizations. Thus X is ergodic in the mean if and only if as n gets large, adding new observations in the calculation of $\bar X_n$ has no effect on it in the sense that $\mathrm{Cov}(X(n), \bar X_n) \to 0$.

Part (c) gives two expressions that can be used to approximate $\mathrm{Var}(\bar X_n)$ and also shows that $\mathrm{Var}(\bar X_n)$ goes to zero at the rate of 1/n. Finally, part (d) provides a general Central Limit Theorem for $\bar X_n$. This gives

$$\bar X_n \pm z_{\alpha/2} \sqrt{\frac{f(0)}{n}}$$

as a $100(1-\alpha)\%$ confidence interval for $\mu$. We will see in Lecture 6 that under the conditions of part (d), we can find an estimator $\hat f(0)$ such that

$$\bar X_n \pm z_{\alpha/2} \sqrt{\frac{\hat f(0)}{n}}$$

is also a $100(1-\alpha)\%$ large sample confidence interval for $\mu$.


Equivalent  Number  of  Uncorrelated  Observations 

To illustrate the results of Theorem 4.1.1, suppose that X is an AR(1) process with coefficient $\alpha$, noise variance $\sigma^2$, and mean $\mu$. Then

$$f(0) = \frac{\sigma^2}{(1+\alpha)^2} = R(0)\,\frac{1-\alpha^2}{(1+\alpha)^2} = R(0)\,\frac{1+\rho}{1-\rho},$$

where $\rho = \rho(1) = -\alpha$. Thus a 95% confidence interval for $\mu$ is given by

$$\bar X_n \pm 1.96 \sqrt{\frac{R(0)}{n}\,\frac{1+\rho}{1-\rho}}.$$

This expression allows us to introduce the idea of an equivalent number of uncorrelated observations. Recall that a 95% confidence interval for the mean of a population having variance R(0) based on a random sample of size N is given by

$$\bar X \pm 1.96 \sqrt{\frac{R(0)}{N}}.$$

Thus to make sure that the random sample and the sample from the AR(1) process lead to confidence intervals of the same width, we would need

$$N = n\,\frac{1-\rho}{1+\rho}.$$

For a covariance stationary series with spectral density f that is not necessarily an AR(1), the same argument gives

$$N = \frac{nR(0)}{f(0)} = \frac{n}{\sum_{v=-\infty}^{\infty} \rho(v)}.$$

Thus if $N < n$ ($N > n$), estimation of $\mu$ from the time series is less (more) accurate than from the corresponding random sample of the same size.
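A short numerical example may help fix the idea. The sketch below (function names and the values $n = 300$, $\alpha = -0.5$ are assumptions) computes the equivalent number of uncorrelated observations for an AR(1) process both from the closed form and from the general expression $n / \sum_v \rho(v)$ with $\rho(v) = (-\alpha)^{|v|}$.

```python
def equivalent_sample_size_ar1(n, alpha):
    """N = n (1 - rho) / (1 + rho) with rho = rho(1) = -alpha."""
    rho = -alpha
    return n * (1.0 - rho) / (1.0 + rho)


def equivalent_sample_size(n, rho, max_lag):
    """N = n / sum_{|v| <= max_lag} rho(v), where rho is a function of the lag
    and the infinite sum is truncated at max_lag."""
    total = 1.0 + 2.0 * sum(rho(v) for v in range(1, max_lag + 1))
    return n / total


n, alpha = 300, -0.5          # rho(1) = 0.5: positively correlated series
print(equivalent_sample_size_ar1(n, alpha))
print(equivalent_sample_size(n, lambda v: (-alpha) ** abs(v), 200))
```

Both calls give 100: with $\rho = 0.5$, 300 correlated observations estimate the mean about as precisely as 100 independent ones.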
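4.1.2. The Sample Autocovariances and Autocorrelations

In Lecture 1 we stated that the sample autocovariance function

$$\hat R(v) = \frac{1}{n} \sum_{t=1}^{n-|v|} (X(t) - \bar X)(X(t+|v|) - \bar X), \qquad |v| < n,$$

is a biased estimator of the autocovariance function. An alternative is the "unbiased" estimator

$$\tilde R(v) = \frac{1}{n - |v|} \sum_{t=1}^{n-|v|} (X(t) - \bar X)(X(t+|v|) - \bar X), \qquad |v| < n,$$

so called because if we knew $\mu$ and used it instead of $\bar X$ in the definition of $\tilde R$, the result would be unbiased. Nevertheless $\hat R$ is generally used, since it is believed that the mean square error of $\hat R(v)$ is smaller than that of $\tilde R(v)$.


Theorem 4.1.2 Properties of $\hat R$ and $\hat\rho$

Under mild assumptions,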

‘)  MW  -  JK,,)  =  -  M*w 
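b) We have

$$\lim_{n\to\infty} n\,\mathrm{Cov}(\hat\rho(h), \hat\rho(g)) = \sum_{r=-\infty}^{\infty} \bigl[ \rho(r+g)\rho(r+h) + \rho(r-g)\rho(r+h) - 2\rho(h)\rho(r)\rho(r+g) - 2\rho(g)\rho(r)\rho(r+h) + 2\rho(g)\rho(h)\rho^2(r) \bigr]$$

$$= \frac{2}{R^2(0)} \int_0^1 [\cos 2\pi h\omega - \rho(h)]\,[\cos 2\pi g\omega - \rho(g)]\,f^2(\omega)\,d\omega.$$

c) Any finite collection of $\hat R$'s or $\hat\rho$'s is asymptotically multivariate normal.

d) We have

$$\lim_{n\to\infty} n\,\mathrm{Var}(\hat R(v)) = \sum_{r=-\infty}^{\infty} \bigl[ R^2(r) + R(r-v)R(r+v) \bigr] = 2 \int_0^1 \cos^2 2\pi v\omega\, f^2(\omega)\,d\omega,$$

while

$$\lim_{n\to\infty} n\,\mathrm{Var}(\hat\rho(v)) = \sum_{r=-\infty}^{\infty} \bigl[ \rho^2(r) + \rho(r-v)\rho(r+v) - 4\rho(v)\rho(r)\rho(r+v) + 2\rho^2(v)\rho^2(r) \bigr]$$

$$= \frac{2}{R^2(0)} \int_0^1 [\cos 2\pi v\omega - \rho(v)]^2 f^2(\omega)\,d\omega.$$

e) If X is a white noise series with variance $R(0) = \sigma^2$, we have, approximately for large samples, that the elements of the covariance and correlation sequences are uncorrelated, and for $v > 0$,

$$\hat R(v) \sim N\!\left(0, \frac{\sigma^4}{n}\right), \qquad \hat\rho(v) \sim N\!\left(0, \frac{1}{n}\right),$$

while $\hat R(0) \sim N(\sigma^2, 2\sigma^4/n)$.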
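Part (e) is the basis of the familiar $\pm 1.96/\sqrt{n}$ bands drawn on a correlogram. The following small simulation is a sketch only (the seed, sample size, and number of replications are arbitrary assumptions); it counts how often sample autocorrelations of Gaussian white noise fall inside those bands.

```python
import math
import random


def sample_autocorr(x, v):
    """rho-hat(v) = R-hat(v) / R-hat(0), with the sample mean removed."""
    n = len(x)
    xbar = sum(x) / n
    c0 = sum((xi - xbar) ** 2 for xi in x)
    cv = sum((x[t] - xbar) * (x[t + v] - xbar) for t in range(n - v))
    return cv / c0


random.seed(12345)
n, reps, max_lag = 200, 200, 5
bound = 1.96 / math.sqrt(n)     # approximate 95% limits under white noise

inside = 0
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    for v in range(1, max_lag + 1):
        if abs(sample_autocorr(x, v)) <= bound:
            inside += 1

fraction = inside / (reps * max_lag)
print("fraction inside the bands:", fraction)
```

The fraction should be close to 0.95 if the normal approximation with variance 1/n is adequate for this sample size.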


4.1.3.  The  Sample  Spectral  Density  Function 

The  next  theorem  verifies  our  observation  in  Lecture  1  that  the  sample  spectral  density  function  is  too 
oscillatory  to  be  useful  for  making  statistical  inferences. 


Theorem  4.1.3 


Properties  of  the  Periodogram 


Under  mild  assumptions, 

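a) The periodogram is asymptotically unbiased.

b) We have

$$\lim_{n\to\infty} \mathrm{Var}(\hat f(\omega)) = \begin{cases} f^2(\omega), & \omega \ne 0, .5 \\ 2f^2(\omega), & \omega = 0, .5, \end{cases}$$

and

$$\lim_{n\to\infty} \mathrm{Cov}(\hat f(\omega_1), \hat f(\omega_2)) = 0, \qquad \omega_1 \ne \omega_2.$$

c) For any integer $M \ge 1$ and fixed frequencies $\omega_1, \ldots, \omega_M$ in (0, .5), we have that the random variables

$$\frac{2\hat f(\omega_1)}{f(\omega_1)}, \ldots, \frac{2\hat f(\omega_M)}{f(\omega_M)}$$

converge in distribution to M independent random variables each having the chi-square distribution with two degrees of freedom.

d) If X is a normal white noise process with variance $\sigma^2$, then the periodogram ordinates, that is, the values of the sample spectral density at the natural frequencies, are independent and

$$\hat f(\omega_j) \sim \begin{cases} \dfrac{\sigma^2}{2}\,\chi^2_2, & \omega_j \ne 0, .5 \\ \sigma^2 \chi^2_1, & \omega_j = 0, .5. \end{cases}$$

Implications: Part (b) shows that the variance of the periodogram does not go to zero as n increases, so the periodogram is not a consistent estimator of f. Part (b) also shows that the values of the periodogram are asymptotically uncorrelated at different frequencies. Thus the typical periodogram will look like that of a white noise series itself except that its level will change with the level of the true spectral density.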

4.2.  Tests  for  White  Noise 

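4.2.1. Bartlett's Test

If X is white noise, its spectral density is constant, so a plot of the cumulative periodogram

$$\hat F(\omega_k) = \frac{\sum_{j=1}^{k} \hat f(\omega_j)}{\sum_{j=1}^{q} \hat f(\omega_j)}, \qquad k = 1, \ldots, q,$$

of the data should be close to the points $(\omega_k, k/q)$, which lie on a straight line. In fact it is easy to show that

$$\lim_{n\to\infty} P\!\left( \max_{1 \le k \le q} \sqrt{q}\,\Bigl| \hat F(\omega_k) - \frac{k}{q} \Bigr| \le b \right) = \sum_{j=-\infty}^{\infty} (-1)^j e^{-2b^2 j^2} = G(b),$$

which can be used to judge whether a cumulative periodogram has a maximum deviation from the expected straight line too extreme to be reasonable under the hypothesis of white noise.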

4.2.2.  The  Portmanteau  Test 

Bartlett's test is done in the frequency domain and uses the fact that the spectral density of white noise is constant. Another approach to the problem of testing for white noise is to use the fact that its autocorrelation coefficients for nonzero lag are all zero. The most popular method of this type is often called the portmanteau or Q test. If X(1), ..., X(n) is a sample realization from a white noise process, then the statistic

$$Q = n(n+2) \sum_{v=1}^{m} \frac{\hat\rho^2(v)}{n - v} \sim \chi^2_m,$$

approximately. Thus the hypothesis of white noise is rejected at level $\alpha$ if $Q > \chi^2_{\alpha,m}$. Note that this test is often applied to the residuals (one step ahead forecast errors) from a model that has been fit to a data set. In this case, one degree of freedom is subtracted from m for each parameter that has been estimated.
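The Q statistic is easy to compute from the sample autocorrelations. The sketch below uses assumed function names and simulated data, and compares Q for $m = 10$ with 18.307, the tabulated upper 5% point of the chi-square distribution with 10 degrees of freedom. The AR(1) series is included only to show the statistic reacting to strong autocorrelation.

```python
import random


def sample_autocorr(x, v):
    n = len(x)
    xbar = sum(x) / n
    c0 = sum((xi - xbar) ** 2 for xi in x)
    cv = sum((x[t] - xbar) * (x[t + v] - xbar) for t in range(n - v))
    return cv / c0


def portmanteau_q(x, m):
    """Q = n (n + 2) * sum_{v=1}^{m} rho-hat(v)**2 / (n - v)."""
    n = len(x)
    return n * (n + 2) * sum(sample_autocorr(x, v) ** 2 / (n - v)
                             for v in range(1, m + 1))


CHI2_05_10 = 18.307   # upper 5% point of chi-square with 10 d.f. (table value)

random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]

# AR(1) series X(t) - 0.8 X(t-1) = eps(t), driven by the same noise.
ar = [0.0]
for e in noise[1:]:
    ar.append(0.8 * ar[-1] + e)

q_noise = portmanteau_q(noise, 10)
q_ar = portmanteau_q(ar, 10)
print("white noise: Q =", round(q_noise, 2), "reject:", q_noise > CHI2_05_10)
print("AR(1):       Q =", round(q_ar, 2), "reject:", q_ar > CHI2_05_10)
```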

The  Choice  of  m  in  the  Q  Test 

One difficulty in using the Q test is in deciding what value of m to use. Asymptotically this choice shouldn't matter, but to investigate its effect in small and moderate samples, we give in Table 4.1 the results of a simple simulation study. For samples of size 50, 100, and 200, we generated 500 Gaussian white noise series and counted how many times the null hypothesis of white noise was rejected for $\alpha$ = .01, .05, and .10 for m = 5, 10, and 20. For each sample size, these should be 5, 25, and 50 for the respective values of $\alpha$. For each $\alpha$, we also counted the number of times out of 500 that the conclusion was not the same for the three values of m. From these results, it appears that (1) the true Type I error probability of the Q test tends to increase with m, particularly for smaller sample sizes, and (2) the effect of choosing m is more important as $\alpha$ increases.

Table 4.1. Results of a Simulation Study of the Effect of the Choice of m in the Q Test

               alpha = .01            alpha = .05            alpha = .10
            n=50  n=100  n=200     n=50  n=100  n=200     n=50  n=100  n=200
m = 5          8      0      6       31     27     27       48     52     35
m = 10        14      8      4       31     28     25       56     58     54
m = 20        18     14      ?       46     43     23       72     70     63
# diff.       21     20     11       57     48     45       81     82     74


4.3.  Estimating  the  Parameters  of  ARMA  Models 

Given data X(1), ..., X(n) from an ARMA process of orders (p,q), we would like to find estimates $\hat\alpha$, $\hat\beta$ and $\hat\sigma^2$ of the parameters of the model, as this would allow us to estimate f by

$$\hat f(\omega) = \hat\sigma^2\,\frac{\bigl|\sum_{k=0}^{q} \hat\beta_k e^{2\pi i k\omega}\bigr|^2}{\bigl|\sum_{j=0}^{p} \hat\alpha_j e^{2\pi i j\omega}\bigr|^2},$$

and we could substitute the parameter estimates into the formulas for forecasting future values of X. In this section we assume that the orders p and q are known and discuss the estimation of the parameters. In the next section we consider the problem of determining p and q.
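Evaluating the ARMA spectral density at given coefficient values is a direct computation. The following is a minimal sketch (the function name and the example coefficients are assumptions) using the convention $\alpha_0 = \beta_0 = 1$.

```python
import cmath
import math


def arma_spectral_density(alpha, beta, sigma2, omega):
    """f(omega) = sigma2 * |h(e^{2 pi i omega})|^2 / |g(e^{2 pi i omega})|^2,
    with g(z) = 1 + sum_j alpha_j z^j and h(z) = 1 + sum_k beta_k z^k."""
    z = cmath.exp(2j * math.pi * omega)
    g = 1.0 + sum(a * z ** (j + 1) for j, a in enumerate(alpha))
    h = 1.0 + sum(b * z ** (k + 1) for k, b in enumerate(beta))
    return sigma2 * abs(h) ** 2 / abs(g) ** 2


# White noise has a flat spectrum equal to sigma2.
print(arma_spectral_density([], [], 2.0, 0.3))

# AR(1) with alpha = 0.5: f(0) = 1/(1 + 0.5)^2 and f(0.5) = 1/(1 - 0.5)^2.
print(arma_spectral_density([0.5], [], 1.0, 0.0),
      arma_spectral_density([0.5], [], 1.0, 0.5))

# ARMA(1,4) example on a coarse frequency grid.
for w in (0.0, 0.125, 0.25, 0.375, 0.5):
    print(w, round(arma_spectral_density([-0.90], [0.0, 0.0, 0.0, 0.80], 1.0, w), 4))
```

In practice the estimates $\hat\alpha$, $\hat\beta$, $\hat\sigma^2$ would be passed in place of the illustrative values.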
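The first method that we consider is the method of maximum likelihood.


Theorem 4.3.1 Asymptotic Distribution of MLE's for Time Series

Let X be a Gaussian time series whose spectral density f depends on an r-dimensional parameter $\theta$, and let $\hat\theta_n$ be the maximum likelihood estimator of $\theta$ based on $X_n$. Then under certain regularity conditions,

$$\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{L} N_r\bigl(0, I^{-1}(\theta)\bigr),$$

where $I(\theta)$ is called the information matrix of $\theta$ and has (j,k)th element

$$I_{jk}(\theta) = \frac{1}{2} \int_0^1 \frac{\partial \log f(\omega)}{\partial \theta_j}\,\frac{\partial \log f(\omega)}{\partial \theta_k}\,d\omega, \qquad j, k = 1, \ldots, r.$$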

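Asymptotic Distribution of ARMA MLE's

The next result applies Theorem 4.3.1 to ARMA processes.

Theorem 4.3.2 Distribution of ARMA MLE's

a) The information matrix of $\theta = (\alpha^T, \beta^T, \sigma^2)^T$ is given by

$$I(\theta) = \begin{bmatrix} I^{\alpha\alpha} & I^{\alpha\beta} & 0 \\ & I^{\beta\beta} & 0 \\ \text{Symmetric} & & \dfrac{1}{2\sigma^4} \end{bmatrix},$$

where for $j, k = 1, \ldots, p$ and $l, m = 1, \ldots, q$, we have

$$I^{\alpha\alpha}_{jk} = R_Y(|j - k|)$$

$$I^{\beta\beta}_{lm} = R_Z(|l - m|)$$

$$I^{\alpha\beta}_{jl} = -\int_0^1 \frac{e^{2\pi i (j-l)\omega}}{g(e^{2\pi i\omega})\,h(e^{-2\pi i\omega})}\,d\omega,$$

with $R_Y$ and $R_Z$ the autocovariance functions of $Y \sim AR(p, \alpha, 1)$ and $Z \sim AR(q, \beta, 1)$.

b) If $q = 0$, then $I^{-1}(\alpha) = S(\alpha)$, where $S(\alpha)$ is the Schur matrix corresponding to $\alpha$, and thus

$$\sqrt{n}\,(\hat\alpha - \alpha) \xrightarrow{L} N_p(0, S(\alpha)).$$

c) If $p = 0$, then $I^{-1}(\beta) = S(\beta)$, where $S(\beta)$ is the Schur matrix corresponding to $\beta$, and thus

$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{L} N_q(0, S(\beta)).$$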
4.3.2.  Approximate  MLE’s 

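Box and Jenkins show that if X is an invertible ARMA process with a Gaussian error term $\epsilon$, then the likelihood function of $\alpha$, $\beta$, and $\sigma^2$ for a realization X of length n can be expressed as

$$L(\alpha, \beta, \sigma^2 \mid X) = (2\pi\sigma^2)^{-n/2}\,|M_n(\alpha, \beta)|^{1/2} \exp\!\left\{ -\frac{S(\alpha, \beta)}{2\sigma^2} \right\},$$

where $M_n$ is a matrix that is a function of $\alpha$ and $\beta$, and

$$S(\alpha, \beta) = \sum_{t=-\infty}^{n} [\epsilon_t]^2,$$

with $[\epsilon_t] = E(\epsilon(t) \mid X)$, where the conditional expectation is calculated assuming that $\alpha$ and $\beta$ are the true parameters of the ARMA model. To find the values of $\alpha$, $\beta$, and $\sigma^2$ that maximize L, Box and Jenkins suggest first that the term $|M_n(\alpha, \beta)|$ can be ignored, and second that the summation in $S(\alpha, \beta)$ can be truncated below at some finite limit, $-T$, say. The process they suggest for finding the $[\epsilon_t]$'s is called backcasting. Once $\hat\alpha$ and $\hat\beta$ are found to minimize $S(\alpha, \beta)$, the value of $\sigma^2$ that maximizes the likelihood (again ignoring the term $|M_n|^{1/2}$) is given by

$$\hat\sigma^2 = \frac{S(\hat\alpha, \hat\beta)}{n}.$$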

4.3.3.  Method  of  Moments  Estimators 


Given the autocorrelations, the coefficients and error variance for AR, MA, and ARMA processes can be found, using the relationships between the true $\rho$'s and true coefficients. A natural way to estimate the coefficients of these processes is to use estimates of the autocorrelations in these procedures. Such estimators are called method of moments estimators. For AR processes, the method of moments estimates use the Yule-Walker equations with sample autocorrelations substituted for true autocorrelations. These estimates are thus called the Yule-Walker estimates. For an AR(p) process the Yule-Walker estimates have the same asymptotic properties as the maximum likelihood estimators; that is, they are asymptotically efficient.
MA Processes

For an MA(q) process, the method of moments estimators can be found only if

$$\hat f(\omega) = \sum_{v=-q}^{q} \hat R(v) \cos 2\pi v\omega$$

is positive for all $\omega \in [0, 1]$.

4.3.4.  Estimation  for  AR  Processes 

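AR processes are widely used in both spectral estimation and prediction. In this section we consider some non-MLE estimation procedures for such processes. Throughout the section we assume that the sample mean has been removed from the data being analyzed.

Yule-Walker Estimators

The traditional method of estimating the parameters $\alpha$ and $\sigma^2$ of an $AR(p, \alpha, \sigma^2)$ process is to substitute the sample autocovariances $\hat R$ for the true autocovariances R in the Yule-Walker equations and then solve the resulting sample Yule-Walker equations:

$$\hat\Gamma_p \hat\alpha = -\hat r_p \qquad \text{and} \qquad \hat\sigma^2 = \sum_{j=0}^{p} \hat\alpha_j \hat R(j),$$

where $\hat\Gamma_p$ is the $p \times p$ matrix with (j,k)th element $\hat R(|j-k|)$, $\hat r_p = (\hat R(1), \ldots, \hat R(p))^T$, and $\hat\alpha_0 = 1$. The solutions are the Yule-Walker estimators (YWE) $\hat\alpha$ and $\hat\sigma^2$. The availability of Levinson's algorithm makes them easy to compute.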
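The sample Yule-Walker equations can be solved recursively in the order. The sketch below is one common form of the Levinson recursion written for the sign convention $X(t) + \sum_j \alpha_j X(t-j) = \epsilon(t)$; the function names are assumptions, and the routine is not taken from TIMESLAB.

```python
def sample_autocov(x, v):
    """R-hat(v) with divisor n; the sample mean is removed."""
    n = len(x)
    xbar = sum(x) / n
    return sum((x[t] - xbar) * (x[t + v] - xbar) for t in range(n - v)) / n


def levinson(R, p):
    """Solve the Yule-Walker equations for orders 1..p.
    R: autocovariances R[0..p]. Returns (alpha, sigma2) where
    alpha = [alpha_1..alpha_p] for order p and sigma2[k] is the
    error variance for order k = 0..p."""
    alpha = []
    sigma2 = [R[0]]
    for k in range(1, p + 1):
        acc = R[k] + sum(alpha[j - 1] * R[k - j] for j in range(1, k))
        kappa = -acc / sigma2[-1]          # alpha_{k,k}
        alpha = [alpha[j - 1] + kappa * alpha[k - j - 1]
                 for j in range(1, k)] + [kappa]
        sigma2.append(sigma2[-1] * (1.0 - kappa * kappa))
    return alpha, sigma2


def yule_walker(x, p):
    """Yule-Walker estimates (alpha-hat, sigma2-hat) from data x."""
    R = [sample_autocov(x, v) for v in range(p + 1)]
    alpha, sigma2 = levinson(R, p)
    return alpha, sigma2[-1]


# Check on exact autocorrelations of X(t) - 0.80 X(t-1) + 0.40 X(t-2) = eps(t):
rho1 = 0.80 / 1.40
rho2 = 0.80 * rho1 - 0.40
alpha, sigma2 = levinson([1.0, rho1, rho2], 2)
print(alpha)   # approximately [-0.80, 0.40]
```

Feeding in the true autocorrelations returns the true coefficients; with data, `yule_walker(x, p)` substitutes the sample autocovariances as described above.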


Theorem  4.3.3 


Properties  of  YWE’s 


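Let $\hat\alpha$ and $\hat\sigma^2$ be the YWE's calculated from a sample X(1), ..., X(n). Then

a) $\hat\alpha$ and $\hat\sigma^2$ are the ordinary least squares estimators for the regression problem $y = -X\alpha + \epsilon$, where $y = (X(1), \ldots, X(n), 0, \ldots, 0)^T$ is an $(n+p)$-vector and X is the $(n+p) \times p$ matrix whose jth column is y shifted down j places, with zeros filled in at the top.

b) The zeros of $\hat g(z) = \sum_{j=0}^{p} \hat\alpha_j z^j$ are guaranteed to be outside the unit circle.

c) If X(1), ..., X(n) is a sample from a Gaussian $AR(p, \alpha, \sigma^2)$ process, then

$$\sqrt{n} \left( \begin{bmatrix} \hat\alpha \\ \hat\sigma^2 \end{bmatrix} - \begin{bmatrix} \alpha \\ \sigma^2 \end{bmatrix} \right) \xrightarrow{L} N_{p+1}\!\left( 0, \begin{bmatrix} S(\alpha) & 0 \\ 0 & 2\sigma^4 \end{bmatrix} \right),$$

where $S(\alpha)$ is the Schur matrix corresponding to $\alpha$.

Implications: This theorem gives two important results in addition to verifying our observation about the use of least squares for observed data. First, the fitted process is guaranteed to be stable. Thus its spectral density is sure to be positive. Further, if we use the $\hat\alpha$ in the AR prediction formula, we have

$$\hat X(n+j) = -\sum_{i=1}^{p} \hat\alpha_i \hat X(n+j-i), \qquad j = 1, 2, \ldots,$$

and the predictors will converge to zero. The second result is that the YWE's are asymptotically efficient, that is, they have the same large sample properties as the maximum likelihood estimators.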


Properties  of  Sample  Partial  Autocorrelations 

The  sample  partial  autocorrelations  play  an  important  role  in  determining  models  for  time  series.  In 
the  next  theorem  we  describe  some  basic  sampling  properties  that  they  have. 


Theorem  4.3.4 


Properties  of 


Partial  Autocorrelations 


If X is a Gaussian AR(p) process, then the sample partial autocorrelations for lags p + 1 and higher are asymptotically independent and identically normally distributed, are asymptotically unbiased, and have asymptotic variance 1/n.


4.4.  Identifying  ARMA  Models 

The  problem  of  determining  what  type  of  ARMA  model  best  fits  a  data  set  has  become  very  important 
in time series analysis. In this section we describe a variety of diagnostics that have been used to identify models.

4.4.1.  Some  Useful  Diagnostics 

Suppose  that  we  have  a  realization  X(l), . . . ,  X(n)  from  a  covariance  stationary  time  series  X.  We 
saw  in  Lecture  3  that  the  true  autocorrelations,  partial  autocorrelations,  and  spectral  density  have  certain 
characteristics  for  different  model  types.  The  statements  below  are  of  necessity  rather  general  and  somewhat 
vague.  In  later  parts  of  this  section  we  will  describe  some  methods  that  have  been  developed  to  try  to  use 
these  statements  in  a  somewhat  automatic  way. 


AR  Processes 

The autocorrelation function decays to zero and then oscillates about zero, while the partial autocorrelation becomes identically zero for lags greater than the true order p. The decay of $\rho$ can follow a sinusoidal pattern if some of the zeros of its characteristic polynomial are complex. The spectral density of an AR process can contain very sharp peaks, while the troughs appear somewhat less sharp.

MA  Processes 

The  autocorrelation  function  is  identically  zero  for  lags  greater  than  the  order  of  the  process,  while 
the  partial  autocorrelation  function  does  not  become  identically  zero.  The  peaks  in  the  spectral  density  are 
smooth  relative  to  what  can  be  attained  for  AR  models,  while  the  troughs  can  be  rather  sharp. 

ARMA  Process 

Here the autocorrelation function decays in the same way as that of an AR process but only after lag q.
Again  the  partial  autocorrelation  does  not  become  zero.  The  spectral  density  function  can  now  have  either 
sharp  or  rounded  peaks  and  troughs. 

The  statements  made  above  are  for  models  that  have  nonzero  coefficients.  If  only  a  few  of  the  coefficients 
of  a  model  are  nonzero,  then  further  statements  can  be  made.  We  consider  two  examples  of  this,  although 
there  are  a  wide  variety  of  possibilities.  First,  if  we  have  an  MA  or  ARMA  model  with  only  a  few  nonzero 
coefficients, then $\rho(v)$ can be small for some lags smaller than q. Second, if X is an AR(p) with only
4.4.2. The AIC and Related Criteria

One approach to choosing among competing models for a data set is due to Akaike. Given several competing models, one would probably choose the one that best "fits" the data in some sense. For example, if we have two possible models for the dependent variable and each is in terms of a set of independent variables, then we would probably choose that model that leads to the smaller residual variance. If the models have different numbers of parameters, however, this comparison is no longer fair. In regression, it is a well-known phenomenon that adding a variable to a model cannot increase the residual sum of squares, so in such problems increasing the complexity of a model may reduce the residual variance without improving the model. For a model that has r parameters, Akaike proposed measuring the "goodness" of the model by

$$AIC(r) = -2 \log L(\hat\theta) + 2r,$$

where $L(\hat\theta)$ is the maximized likelihood and r is the number of parameters in the model. The AIC is then used by choosing the model for which it is smallest. For ARMA models, the AIC is essentially

$$AIC(r) = n \log \hat\sigma_r^2 + 2r,$$

where $\hat\sigma_r^2$ is the maximum likelihood estimate of the error variance for the model having r parameters.
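For pure AR models the criterion $n \log \hat\sigma_r^2 + 2r$ can be evaluated for every candidate order in one pass. The sketch below is an illustration under stated assumptions: it uses the Yule-Walker error variances from a Levinson recursion in place of the exact maximum likelihood estimates, and the function names and example values are not from the lecture.

```python
import math


def levinson_variances(R, max_order):
    """Yule-Walker error variances sigma2[0..max_order] from
    autocovariances R[0..max_order] (Levinson recursion)."""
    alpha = []
    sigma2 = [R[0]]
    for k in range(1, max_order + 1):
        acc = R[k] + sum(alpha[j - 1] * R[k - j] for j in range(1, k))
        kappa = -acc / sigma2[-1]
        alpha = [alpha[j - 1] + kappa * alpha[k - j - 1]
                 for j in range(1, k)] + [kappa]
        sigma2.append(sigma2[-1] * (1.0 - kappa * kappa))
    return sigma2


def aic_order(R, n, max_order):
    """Return (best_order, aic_values) with AIC(k) = n log sigma2_k + 2k."""
    sigma2 = levinson_variances(R, max_order)
    aic = [n * math.log(sigma2[k]) + 2 * k for k in range(max_order + 1)]
    best = min(range(max_order + 1), key=lambda k: aic[k])
    return best, aic


# Exact autocovariances of an AR(1) with alpha = 0.5, sigma2 = 1:
R0 = 1.0 / (1.0 - 0.25)
R = [R0 * (-0.5) ** v for v in range(4)]
best, aic = aic_order(R, n=100, max_order=3)
print(best, [round(a, 3) for a in aic])
```

With exact autocovariances the error variance stops decreasing after order 1, so the penalty term 2k makes order 1 the minimizer. With sample autocovariances the variances keep decreasing slightly, which is why the penalty is needed.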
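The CAT Criterion and AR Models

Another criterion for choosing the order of an autoregressive process is the CAT criterion due to Parzen, who considered the problem of determining the order p of an autoregressive process.


Definition. Let $\hat\sigma_1^2, \ldots, \hat\sigma_M^2$ be the error variance estimates from the Yule-Walker equations based on a sample of size n, and let

$$\tilde\sigma_j^2 = \frac{n}{n - j}\,\hat\sigma_j^2, \qquad j = 1, \ldots, M.$$

Then the criterion autoregressive transfer function (CAT) criterion for order k is defined to be

$$CAT(k) = \begin{cases} \dfrac{1}{n} \displaystyle\sum_{j=1}^{k} \tilde\sigma_j^{-2} - \tilde\sigma_k^{-2}, & k = 1, \ldots, M \\[2ex] -\left(1 + \dfrac{1}{n}\right), & k = 0. \end{cases}$$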
t  =  0. 
Note that the natural way to define CAT(0) would be as just $-1/\hat R(0)$, in which case it has been shown that the AIC and the CAT criterion are asymptotically equivalent. Parzen has suggested the definition above so that CAT can be used as a test for white noise. As we pointed out above, if X is an AR(0) (that is, white noise) process, then AIC (and hence CAT with CAT(0) $= -1/\hat R(0)$) would asymptotically choose order zero only approximately 75% of the time.
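
As a rough illustration of how CAT is evaluated, the sketch below assumes Parzen's usual form of the criterion, CAT(k) = (1/n) Σ_{j≤k} 1/σ̃_j² − 1/σ̃_k² with σ̃_j² = n σ̂_j²/(n − j) and CAT(0) = −(1 + 1/n), and it assumes the variance estimates are expressed relative to the sample variance of the data. The function name is hypothetical.

```python
def cat_criterion(sigma2_hat, n):
    """CAT(0..M) from Yule-Walker variances sigma2_hat[k-1] for orders k = 1..M.

    Assumes the variances are normalized by the sample variance R(0).
    """
    cat = [-(1.0 + 1.0 / n)]           # CAT(0), Parzen's white noise benchmark
    running = 0.0                      # running sum of 1 / sigma_tilde_j^2
    for k, s in enumerate(sigma2_hat, start=1):
        tilde = n / (n - k) * s        # bias-adjusted variance estimate
        running += 1.0 / tilde
        cat.append(running / n - 1.0 / tilde)
    return cat


def cat_order(sigma2_hat, n):
    """Order (possibly 0) minimizing CAT."""
    cat = cat_criterion(sigma2_hat, n)
    return min(range(len(cat)), key=cat.__getitem__)
```

A selected order of zero is then read as evidence that the series is indistinguishable from white noise.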

AIC  and  ARMA  Models 

The  AIC  is  also  applicable  for  choosing  the  order  of  ARMA  processes.  Ideally,  we  should  use  the 
exact  maximum  likelihood  method  for  each  possible  order  and  then  calculate  AIC  based  on  the  maximized 
likelihood.  Unfortunately,  this  is  very  time  consuming  and  in  some  cases  (particularly  when  the  process  is 
close  to  nonstationary  or  noninvertible)  it  is  difficult  to  get  the  maximization  procedure  to  converge.  An 
alternative  is  to  use  the  estimate  of  the  error  variance  obtained  from  the  method  of  moments  estimator  for 
each order. This also can be difficult to implement since in some cases the factorization of the covariance
generating function becomes infeasible. Thus in TIMESLAB, we suggest using the consistent but inefficient
method described below to estimate the error variance for each order.


4.4.3.  The  Stepwise  ARMA  Method 

Since an ARMA model looks very much like an ordinary regression model, it is natural to apply regression
techniques in its analysis. In fact we saw above that applying ordinary least squares to autoregressive data
leads to asymptotically efficient parameter estimates. In the MA case this was not true. In this section we
extend the regression analogy to the ARMA case by describing the TIMESLAB command ARMASEL, which
performs a stepwise ARMA procedure.

Suppose we feel that an ARMA model of some order (p,q) should fit a data set X(1), ..., X(n) and
(P,Q) is an upper bound on the order. Then in analogy with stepwise regression analysis, we would write

$$X(t) = -\alpha_1 X(t-1) - \cdots - \alpha_P X(t-P) + \beta_1 \epsilon(t-1) + \cdots + \beta_Q \epsilon(t-Q) + \epsilon(t),$$

that is, we would regress X(t) on previous X's and ε's. Since we cannot observe the ε's, we cannot do the
regression directly. The command

ARMASEL(x,n,m,s,k1,k2,kopt,pval,p,q,alpha,beta,rvar,ier)

carries out a stepwise version of the regression described above. The first eight arguments are input and the
last six are output, although alpha and beta can be used as both input and output if kopt is negative, as
described below. The final model chosen has orders p and q returned in the integers p and q, coefficients
α and β returned in the arrays alpha and beta, and noise variance returned in the real scalar rvar. The
output integer ier is 0 if no errors are encountered, while it is 1 if the matrix being swept is judged to be
singular.

The arguments x, n, m, and s contain the data, the sample size, and m and s as described above.
The arguments k1, k2, kopt, and pval allow the user to use ARMASEL in a variety of ways. First, the real
scalar pval is the "p-value to enter" at each step of the stepwise procedure. Thus using a smaller value
for pval allows more variables to enter the model. Then m, s, k1, k2, and kopt (and possibly alpha and
beta) together describe what lags are contenders for inclusion in the model. First, the largest AR or MA
lag possible is m-s as described above. Then we have the following rules based on the value of kopt.

kopt=0     The k1 AR lags (out of the possible m-s) having the largest $R_{XX}(v)$ in absolute value
           and the k2 MA lags having the largest $R_{X\epsilon}(-v)$ in absolute value are contenders for
           inclusion.

kopt=j>0   This is the same as kopt=0 except that the stepwise procedure continues until j lags
           are included.

kopt=-j<0  The k1 AR lags that are entered in alpha and the k2 MA lags that are entered in beta
           are forced into the model. Thus k1+k2=j.

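Subset Autoregression

For some time series, an autoregression that uses only some of the lags up to a maximum order is preferable
to the full autoregression. For example, for one series a subset model with lags 1, 2, 4, 10, and … has a
value of AIC smaller than that of the full model. The ARMASEL command can be used to fit such models.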
Forecasting Using Models


In this lecture we discuss forecasting based on both ARMA models and regression models.

5.1.  Box-Jenkins  Modeling  and  Forecasting 

The most popular ARMA-model based forecasting method is one proposed by Box and Jenkins. The
method uses the traditional statistical modeling strategy of (1) model identification, (2) parameter estimation,
(3) diagnostic checking, and (4) consideration of alternative models, if necessary, within the class of
multiplicative seasonal ARIMA models to obtain a model suitable for forecasting. In this section we
describe the various steps in the method. The appeal of the method is its simplicity and the wide availability
of computer software for using it.


The  Models 

The Box-Jenkins method for data X(1), ..., X(n) assumes that if we transform X to a series W by

$$W(t) = (1 - L)^d (1 - L^S)^D\, Y(t),$$

where

$$Y(t) = \begin{cases} (X(t) + m)^{\lambda}, & \lambda \neq 0, \\ \log(X(t) + m), & \lambda = 0, \end{cases}$$

and m is chosen to make X(t) + m positive for all t (unless λ = 1), then a model of the form

$$g(L)\,G(L^S)\,[W(t) - \mu] = h(L)\,H(L^S)\,\epsilon(t),$$

where g, G, h, and H are polynomials of degrees p, P, q, and Q,
with $\epsilon \sim WN(\sigma^2)$, will adequately represent W. Note that this model for W can be written as an
ARMA(p + PS, q + QS) model by multiplying the polynomials in the model. One of the appeals of the Box-Jenkins
models is that many of the coefficients in this general ARMA model are allowed to be zero. Thus this class
of models allows us to represent data parsimoniously, that is, with a small number of parameters. This is the
same idea as in the stepwise ARMA modeling embodied in the ARMASEL command.
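
The transformation from X to W can be sketched as follows. This is a minimal illustration of the power transform and the two differencing operators only; the function names and default arguments are assumptions made for the example and are not TIMESLAB commands.

```python
import math


def power_transform(x, lam, m=0.0):
    """Y(t) = (X(t) + m)**lam for lam != 0, and log(X(t) + m) for lam == 0."""
    if lam == 0:
        return [math.log(v + m) for v in x]
    return [(v + m) ** lam for v in x]


def difference(y, lag=1):
    """Apply (1 - L**lag) once; the result is `lag` observations shorter."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


def box_jenkins_transform(x, lam=1.0, m=0.0, d=0, D=0, S=12):
    """W(t) = (1 - L)**d (1 - L**S)**D Y(t), with Y the power transform of X."""
    w = power_transform(x, lam, m)
    for _ in range(d):
        w = difference(w, 1)      # ordinary differencing removes polynomial trend
    for _ in range(D):
        w = difference(w, S)      # seasonal differencing removes a period-S cycle
    return w
```

Each application of a difference shortens the series, so W has n − d − SD observations, which is the loss of observations referred to in the discussion of identifying differences.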

The model for W is called the multiplicative seasonal ARMA model with orders (p, P, q, Q, d, D, S), together
with the coefficients of its four polynomials. We will generalize this model slightly by removing the restriction that the powers
of L in the second and fourth polynomials above are of the form kS and rS. This allows us to include the
general subset ARMA model in the class of models that we can consider. Thus we have the model

$$g(L)G(L)\,[W(t) - \mu] = h(L)H(L)\,\epsilon(t).$$

We can also write the model with a constant term $T_0$ as

$$g(L)G(L)\,W(t) = T_0 + h(L)H(L)\,\epsilon(t),$$

where

$$T_0 = g(1)\,G(1)\,\mu.$$

w  .^,z!r,£E5U?3  zzxzt Y  “Ar  10  ***•  •  *• 

Sr11- ,kt °r  iti  —  « 

Model  Identification 

The model identification step itself consists of three parts:

1. Determine if the power transform from X to Y is necessary, and if so, what value of λ should be used.

2. Determine if the differencing transform from Y to W is necessary, and if so, what values of d, D, and
   S should be used.

3. Determine if an ARMA model is needed to model W, and if so, what the orders and lags should be.

Identifying λ

“^reoX^^i1:  sar  v^r™ y  r»  -hidi 

Thie  dots  eet  consists  Vjw ’22X 

at  a  faster  than  linear  rate  If  no  newer  tran«fftrm.«  J • r?7  who8e  «npl»tude  is  increasing  with  time 

cycle.  For  example,  we  have  inckded  in  Fi«i«Tf  *?P^eJ  "*  *mount  of  differ«cing  can  remove  the 

»Tio«omom,r°d'.Sw?.m  Cnf"‘“*|'ly.';Wch method U me is somesshst 
Urn  highest  mhe'rf *>  °f  V*  “J  d,M“  *•  -  k» 

jr\0 =»+«+«<».  £&±i>+<(0, 

l“^i«ul^to™n^bb"^  TL'TJH  ■’?  ^  «*•  ‘k« 

November.  **  daU  ,tart  “  J“u«y  “d  *•  maximum  yearly  value  is  * 
Figure 5.1. Example of Identifying λ.

In Figure 5.1 we have included a graph of $R^2$ for 51 values of λ between .1 and .6, and the graphs of $X^\lambda$
for λ = 0, λ = .24 (the value giving the maximum $R^2$), and λ = .6. The log transform is ruled out because
of how poorly the first cycle of the cosine fits the data.

The  method  of  the  previous  paragraph  is  appropriate  for  these  data  because  the  transformed  data  do 
seem  to  be  sinusoidal.  In  general,  one  chooses  the  transformation  so  that  the  resulting  data  most  closely 
match  some  deterministic  function  that  can  be  removed  by  differencing. 

Identifying  Differences 

In  this  part  of  the  process,  one  looks  for  the  least  amount  of  differencing  required  to  produce  a  correlo- 
gram  and  partial  correlogram  that  can  be  matched  by  that  of  a  multiplicative  subset  ARMA  process  having 
only  a  few  parameters.  This  is  not  always  an  easy  process  as  it  requires  rather  extensive  experience.  In 
Section  3.5  we  described  a  variety  of  types  of  correlograms  and  partial  correlograms  and  the  corresponding 
models.  Unfortunately  no  such  description  can  be  exhaustive.  Note  that  doing  more  differencing  leads  to 
losing  observations,  while  doing  less  may  lead  to  having  to  use  more  parameters  in  the  ARMA  model.  The 
probability  limits  for  prediction  that  we  will  obtain  below  are  a  function  of  the  number  of  observations  minus 
the  number  of  estimated  parameters.  Thus  there  is  a  tradeoff  in  degrees  of  freedom  in  these  two  parts  of  the 
identification process.

To illustrate differencing, consider again the sales data. Table 5.1 gives the correlogram and partial
correlogram of the power transformed series Y and of several differenced versions of it.

Table 5.1. Correlations and Partial Correlations for Various Differenced Versions of the Power
Transformed Sales Data

[Table entries are not legible in this copy. The panels include: Correlations of Y; Partials of Y;
Partials of 1st, 12th Difference.]

Identifying  ARMA  Models 


In this part of the identification process, the autocorrelation and partial autocorrelation
functions are used to choose an ARMA model. To illustrate this process we continue our discussion of the sales data. Note that
the correlogram and partial correlogram of the first and twelfth difference show a clear pattern, while
the others don't. First, the lag one correlation is followed by several that are small, followed by
an increase at lags 10 and 11. The first three correlations are decaying exponentially with
alternating signs, indicating the presence of an AR(1) term.
The presence of large correlations at lags 10, 11, and 12 indicates the presence of a subset MA term in the
data. Since the maximum correlation is at lag 11, it is tempting to conclude that the MA term should be of
lag 11. In Table 5.2 we give the true autocorrelation function for the best fitting models having an AR(1)
term and then either an MA(11) term or an MA(12) term.


Table 5.2. True Correlations for Two Models for Sales Data

Coefficients From SEASEST

  1 |  .504522   .780130

rvar = .025497

Correlations for AR(1), MA(11)

  1 |  -.50   .26  -.13   .06  -.02   .00   .02  -.00   .12  -.24   .48  -.24
 13 |   .12  -.06   .03  -.02   .01   .00   .00   .00   .00   .00   .00   .00

Coefficients From SEASEST

  1 |  .501941  -.801871

rvar = .024118

Correlations for AR(1), MA(12)

  1 |  -.50   .25  -.13   .06  -.03   .01   .01  -.03   .06  -.12   .24  -.49
 13 |   .24  -.12   .06  -.03   .02  -.01   .00   .00   .00   .00   .00   .00


The  residual  variance  for  the  model  having  the  MA(12)  term  is  smaller  than  that  for  the  model  having 
the  MA(ll)  term  but  the  correlations  for  the  latter  model  appear  to  match  those  of  the  data  better.  There 
seems  to  be  no  clear  optimal  choice  between  the  two  models. 

Estimation  and  Diagnostic  Checking 

Once  a  model  has  been  determined,  the  approximate  likelihood  estimation  procedure  is  used  to  estimate 
its  parameters,  while  the  portmanteau  test  is  used  to  determine  if  the  residuals  of  the  fitted  model  are  white 

noise. 

Forecasting 

Given estimates of the parameters of a model that has been judged to be adequate, the Box-Jenkins
procedure then produces forecasts as follows. For simplicity we suppose that μ = 0. Then we have

$$g(L)G(L)(1 - L)^d (1 - L^S)^D\, Y(t) = h(L)H(L)\,\epsilon(t),$$

where Y is the power transformed series. The result of using the SEASEST command is estimates of the
coefficients of the four polynomials g, G, h, and H, that is, the parameters of the model for

$$W(t) = (1 - L)^d (1 - L^S)^D\, Y(t).$$

This model for W can be translated into a model for Y, and thus we can write Y as an ARMA model of
orders p* = p + uP + d + SD and q* = q + vQ. Note that if differencing was done, then some of the zeros of
the polynomial for the AR part of the model for Y will be on the unit circle.

Now given this ARMA(p*, q*) model for Y, a recursion for the forecasts of Y can be used. This recursion
is similar to the ones used in the approximate MLE procedure. Once forecasts for Y are obtained they can
be converted to forecasts for the original series X by doing the transformation that is the inverse of the one
done in the power transform step.
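
A forecast recursion of this kind can be sketched as follows. The sketch assumes the sign convention used for the stepwise ARMA regression, Y(t) = −Σ α_j Y(t−j) + Σ β_k ε(t−k) + ε(t), assumes the one-step residuals for the observed stretch are already available, and replaces future errors by their expected value of zero. The function name is hypothetical and this is not the SEASEST implementation.

```python
def arma_forecast(y, eps, alpha, beta, horizon):
    """Forecast Y(n+1), ..., Y(n+horizon) from an ARMA model.

    y     : observed values Y(1..n)
    eps   : one-step residuals aligned with y (same length)
    alpha : AR coefficients alpha_1..alpha_p (document sign convention)
    beta  : MA coefficients beta_1..beta_q
    """
    hist = list(y)       # observations, extended by forecasts as we go
    resid = list(eps)    # residuals, extended by zeros for future times
    for _ in range(horizon):
        t = len(hist)
        ar = -sum(a * hist[t - j] for j, a in enumerate(alpha, start=1) if t - j >= 0)
        ma = sum(b * resid[t - k] for k, b in enumerate(beta, start=1) if t - k >= 0)
        hist.append(ar + ma)
        resid.append(0.0)  # a future error is forecast by its mean
    return hist[len(y):]
```

For a random walk (alpha = [-1.0], no MA terms) the recursion simply repeats the last observation, which is the behavior expected when the AR polynomial has a zero on the unit circle. Forecasts for X would then be obtained by inverting the power transform, for example by exponentiating when λ = 0.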

Consider Figure 5.2, which contains the sales data with forecasts of the next 24 values of the series and
95% probability limits appended.
Figure 5.2. Sales Data and Forecasts with 95% Probability Limits.

5.2. Other Modeling Strategies

The Box-Jenkins method removes trends and cycles by differencing, and not by regressing the data on
deterministic functions of time.

5.2.1. Regression with Autoregressive Errors

In many series the mean appears to be a deterministic function of time and the errors appear to be
autocorrelated rather than white. A natural model is then

$$y = X\beta + \epsilon,$$

where ε is an autoregressive process of some order p. We can estimate the parameters by the following
iterative process:

1. Use ordinary least squares to find initial estimates $\hat\beta_0$ and residuals $e_0$.

2. Determine the order $\hat p_0$, coefficients $\hat\alpha_0$, and error variance $\hat\sigma_0^2$ of an autoregressive model for $e_0$.

3. Transform the observation vector y and regression matrix X to $\tilde y$ and $\tilde X$ by applying the AR filter found in step
   2 to y and to each column of X.

4. Apply ordinary least squares to $\tilde y$ and $\tilde X$ to find a new coefficient estimate $\hat\beta_1$ and residuals
   $e_1 = y - X\hat\beta_1$.

5. Return to step 2 with $e_1$ replacing $e_0$.

This process continues until successive iterations result in the same value of the AR order. The successive
values of the quantities involved will have subscripts 1, 2, ....
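
The five steps can be sketched with NumPy as shown below. Several choices here are assumptions made only for illustration: the AR model in step 2 is fit by least squares, its order is chosen by AIC, and the function names are hypothetical. The AR coefficients follow the sign convention e(t) + a_1 e(t−1) + ... + a_p e(t−p) = noise.

```python
import numpy as np


def fit_ar_ls(e, p):
    """Least-squares AR(p) fit; returns (coefficients a, residual variance)."""
    if p == 0:
        return np.zeros(0), float(np.mean(e ** 2))
    lagged = np.column_stack([e[p - j:len(e) - j] for j in range(1, p + 1)])
    a = -np.linalg.lstsq(lagged, e[p:], rcond=None)[0]
    resid = e[p:] + lagged @ a
    return a, float(np.mean(resid ** 2))


def ar_filter(v, a):
    """Apply 1 + a_1 L + ... + a_p L^p down the rows of a vector or matrix."""
    p = len(a)
    out = v[p:].copy()
    for j in range(1, p + 1):
        out = out + a[j - 1] * v[p - j:len(v) - j]
    return out


def regress_ar_errors(y, X, max_order=5, max_iter=20):
    """Iterate steps 1-5 until two successive AR orders agree."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]            # step 1: OLS
    order = None
    for _ in range(max_iter):
        e = y - X @ beta                                   # residuals on the original scale
        n = len(e)
        fits = [fit_ar_ls(e, p) for p in range(max_order + 1)]
        aic = [n * np.log(s) + 2 * p for p, (_, s) in enumerate(fits)]
        new_order = int(np.argmin(aic))                    # step 2: order by AIC (assumption)
        a = fits[new_order][0]
        beta = np.linalg.lstsq(ar_filter(X, a), ar_filter(y, a), rcond=None)[0]  # steps 3-4
        if new_order == order:                             # stopping rule from the text
            break
        order = new_order                                  # step 5
    return beta, a, new_order
```

The filtered regression in steps 3 and 4 loses the first p rows, so the design matrix should have comfortably more rows than the largest AR order considered.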
One possible model for the log (base 10) of the lynx data is

,  .  2irt  .  2xt  .  . 

x(t)  =  /i  +  a  cos  —  +  6sm  —  +  «(t), 

where ε is an AR(p). Using the procedure described above, we find p = 2 with coefficients -1.037 and 0.351,
while the estimates of the regression coefficients are 2.901, 0.354, and -0.469.


5.2.2.  ARARMA  Modeling 

One  very  simple  method  for  automatically  modeling  and  forecasting  data  has  been  suggested  by  Parzen 
(1982)  and  used  successfully  by  Newton  and  Parzen  in  a  forecasting  competition  organized  by  Makridakis 
(see  Makridakis  et  al.  (1984)).  The  method  consists  of  two  parts: 

1. For some maximum lag $M_1$, calculate the regression coefficients

$$\hat\phi_k = -\frac{\sum_{t=1}^{n-k} X(t)X(t+k)}{\sum_{t=1}^{n-k} X^2(t)}, \qquad k = 1, \ldots, M_1.$$

Then let m be the value of k having the largest $|\hat\phi_k|$, and form

$$e(t) = X(t+m) + \hat\phi_m X(t), \qquad t = 1, \ldots, n-m.$$

2. Fit an autoregressive process to the e's, using maximum possible order $M_2$ and an order-determining
criterion to determine the order p to use. In the general ARARMA procedure one would fit an ARMA
model to the e's, but in most cases an AR process is adequate.

The  first  part  is  called  the  first  AR  since  it  is  in  essence  fitting  a  one  lag  subset  AR  model  to  X. 
The  second  part  is  called  the  second  AR.  Note  that  the  first  AR  may  result  in  a  coefficient  that  is  greater 
than  one,  while  in  the  second  AR,  it  is  recommended  that  the  Yule-Walker  or  Burg  estimators  be  used  to 
guarantee  that  the  process  fit  to  the  e’s  is  stationary. 
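
The first AR step can be sketched as follows. The sketch follows the rule as stated in the text (choose the lag with the largest absolute coefficient) and assumes that both sums in the coefficient run over t = 1, ..., n − k; the function name is hypothetical and this is not the DTFORE implementation.

```python
def first_ar(x, max_lag):
    """Return (m, phi_m, e): the chosen lag, its coefficient, and the filtered series."""
    n = len(x)
    phi = {}
    for k in range(1, max_lag + 1):
        num = sum(x[t] * x[t + k] for t in range(n - k))
        den = sum(x[t] ** 2 for t in range(n - k))
        phi[k] = -num / den            # regression of X(t+k) on X(t), sign as in the text
    m = max(phi, key=lambda k: abs(phi[k]))
    e = [x[t + m] + phi[m] * x[t] for t in range(n - m)]
    return m, phi[m], e
```

For a series that grows by ten percent from one four-period cycle to the next, the step picks lag 4 with a coefficient near −1.1, which is an example of a first AR coefficient greater than one in absolute value. The second AR would then be fit to the returned list `e`.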

The  result  of  this  procedure  is  a  model  of  the  form 

$$(1 + \hat\phi_m L^m)\, g(L)\, X(t) = \epsilon(t),$$

where  g  is  the  AR  operator  determined  in  part  2  of  the  procedure.  This  model  is  similar  in  form  to  the 
Box-Jenkins  model  (with  no  MA  terms)  except  that  it  is  easily  made  automatic,  and  the  data  determine 
the  nature  of  the  first  AR,  which  is  analogous  to  differencing  and  is  in  fact  sometimes  referred  to  as  “quasi- 
differencing”.  If  the  first  AR  turns  out  to  be  stable,  that  is,  the  coefficient  has  absolute  value  less  than  one, 
then  the  forecasts  will  eventually  converge  to  the  mean  of  the  observed  data.  As  we  discussed  in  Lecture  1, 
forecasts  that  follow  a  difference  equation  of  necessity  either  are  explosive  or  must  converge  to  some  finite 
value.  In  the  short  run  this  may  not  be  troublesome,  but  if  the  analyst  has  a  feel  for  the  long-run  nature  of 
forecasts,  then  this  information  should  be  incorporated  into  whatever  model  is  used,  either  by  modeling  this 
behavior  using  a  regression  model,  or  by  using  differencing  if  polynomial  growth  is  expected,  or  by  insisting 
on  a  stable  ARMA  model  if  the  series  is  expected  to  remain  fairly  constant. 
Figure 5.3. Sales Data and Forecasts As Determined by DTFORE.

The Sales Data Again

In Figure 5.3 we give the result of applying the DTFORE command to the sales data. The first AR chosen
has a coefficient that is more nonstationary than the -1 that would
mean twelfth difference. In data such as this, this quasi-differencing accounts for most of the variation in the
data, and at least in the short term future, the second part of the model fitting has little
impact on the nature of the forecast. Note further that this quasi-differencing will ultimately lead
to more explosive growth than the actual data show. In the figure the actual data are represented
by the x's, and the values plotted were again divided
by 10 prior to doing the plotting. Thus this figure is comparable to the one using the Box-Jenkins method.
If the first AR chosen by DTFORE is stable then it would be possible to find probability
limits for the forecasts. The intent of DTFORE, however, is to have an easy-to-use command that gives
forecasts that have been shown (in the forecasting competition) to compare very favorably with those given
by more elaborate methods.
In this lecture we consider the classic problem of searching for periodicities in time series. The basic
tools in this search are the spectral density function $f$ and sample spectral density function $\hat f$. In Section 6.1
we consider nonparametric estimation of $f$, that is, estimation performed without making any assumptions
about the form of the true $f$. Then in Section 6.2 we consider using ARMA models for estimating $f$. Finally,
in Section 6.3 we use the results of the first two sections to actually search for periodicities in some famous
series.

6.1.  Nonparametric  Spectral  Density  Estimation 

We saw in Lecture 4 that the sample spectral density function $\hat f$ is an inconsistent estimator of the
spectral density function $f$. In this section we describe how $\hat f$ can be modified to produce consistent estimators.

6.1.1. Smoothing the Sample Spectral Density

The basic problem with $\hat f$ is that it is too wiggly to be an adequate estimator of a function that is
typically smooth over much of its domain. In this section we consider a simple averaging approach to
smoothing the periodogram.

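Recall from Theorem 4.1.3 that under general conditions

$$\hat f(\omega_1), \hat f(\omega_2), \ldots, \hat f(\omega_{[n/2]+1})$$

are asymptotically independent random variables with

$$\frac{\hat f(\omega_j)}{f(\omega_j)} \sim \begin{cases} \chi_2^2/2, & \omega_j \neq 0, .5, \\ \chi_1^2, & \omega_j = 0, .5. \end{cases}$$

Consider estimating $f$ at one of the natural frequencies $\omega_j = (j - 1)/n$ by

$$\tilde f(\omega_j) = \frac{1}{2m+1} \sum_{k=j-m}^{j+m} \hat f(\omega_k),$$

that is, by the average of $\hat f(\omega_j)$ and the m values of the periodogram on either side of it. If $\omega_{j-m} < 0$ or
$\omega_{j+m} > .5$, we can use the fact that $\hat f$ is symmetric about 0 and .5 to obtain the elements in the sum.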
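
The simple averaging smoother can be sketched directly from this definition. The example assumes zero-mean data, evaluates the periodogram at all n natural frequencies (j − 1)/n in [0, 1), and uses circular indexing, which is one way to exploit the symmetry of the periodogram about 0 and .5. The function names are hypothetical.

```python
import cmath
import math


def periodogram(x):
    """Sample spectral density (1/n)|sum x(t) exp(-2 pi i t w)|^2 at w = j/n, j = 0..n-1."""
    n = len(x)
    out = []
    for j in range(n):
        z = sum(x[t] * cmath.exp(-2j * math.pi * j * t / n) for t in range(n))
        out.append(abs(z) ** 2 / n)
    return out


def smooth_periodogram(pgram, m):
    """Average each ordinate with the m ordinates on either side of it."""
    n = len(pgram)
    return [sum(pgram[(j + k) % n] for k in range(-m, m + 1)) / (2 * m + 1)
            for j in range(n)]
```

Because the periodogram has period 1 and is symmetric about 0, the ordinate at index −k is the same as the one at index k, so the wrap-around in `smooth_periodogram` supplies exactly the reflected values described above. Larger m gives a smoother but potentially more biased estimate.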

Now suppose that $\omega_{j-m} > 0$ and $\omega_{j+m} < .5$. We can think of $\tilde f(\omega_j)$ as averaging all of the periodogram
values in the frequency band $\omega_j \pm m/n$, that is, as a smoother having bandwidth m/n. We consider
bandwidth to be half the width of the frequency band, that is, the width of the interval on each side of the
center frequency. We can use the asymptotic properties of $\hat f$ to obtain

$$E(\tilde f(\omega_j)) \approx \frac{1}{2m+1} \sum_{k=j-m}^{j+m} f(\omega_k),$$

$$\mathrm{Var}(\tilde f(\omega_j)) \approx \frac{1}{2m+1}\, f^2(\omega_j).$$

These expressions illustrate the tradeoff between bias and variance. Increasing m will increase the bandwidth
and lead to including values of $f$ at frequencies farther from $\omega_j$ in the expected value of
$\tilde f(\omega_j)$, thus increasing the possible bias. If $f$ is nearly constant near $\omega_j$, then this bias
will not increase appreciably. On the other hand, increasing m has the effect of decreasing the variance of
the estimator, again assuming that $f$ does not vary much near $\omega_j$. Note that
the simplest case is white noise, where $f$ is constant: we can increase
the bandwidth without increasing the bias, and the variance is $f^2/(2m+1)$. At the
other extreme, we might be estimating $f$ at a frequency that is in a valley where $f$ rises sharply to a
peak, in which case increasing m could increase both the bias and the variance. This phenomenon is known
as leakage, as high power in $f$ in one frequency band leaks into estimates of $f$ at other frequencies.

As long as $f$ is bounded we can let m go to ∞ as n goes to ∞ so that the variance of $\tilde f(\omega_j)$ goes to
zero. As m and n get large, we have

r>*. 

*  “■*  ik"  u“ 

-  r  i.fc: fM  *  *?>  *m  *—  •**— 

nonoverlapping  frequency  bands  ^on  *vera**ng  values  of  the  periodogram  for 

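We now consider the asymptotic distribution of $\tilde f(\omega_j)$. For $\omega_{j-m} > 0$ and $\omega_{j+m} < .5$, we have
that $\tilde f(\omega_j)$ is asymptotically a weighted sum of $2m+1$ independent $\chi_2^2$
distributed random variables. A common device is to approximate the distribution of such a sum Q by that of
$c\chi_\nu^2$, with c and ν chosen to match the first two moments. We use this device to approximate the distribution of
$\tilde f(\omega_j)/f(\omega_j)$. Thus we have

$$E(Q) = c\nu = E\left(\frac{\tilde f(\omega_j)}{f(\omega_j)}\right) = 1,$$

$$\mathrm{Var}(Q) = 2c^2\nu = \mathrm{Var}\left(\frac{\tilde f(\omega_j)}{f(\omega_j)}\right) = \frac{1}{2m+1}.$$

We then set $c\nu = 1$ and $2c^2\nu = 1/(2m+1)$ and obtain $c = 1/\nu$ and $\nu = 2(2m+1)$, which gives

$$\frac{\nu \tilde f(\omega_j)}{f(\omega_j)} \sim \chi_\nu^2, \qquad \omega_j \neq 0, .5.$$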
Note that in this simple averaging case we could have obtained this immediately from our assumption that
$f$ is constant around $\omega_j$.


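Estimating $f(0)$ and $f(.5)$

The arguments given above must be modified when we consider estimating $f$ at frequency 0 or .5, as
then the values of $\hat f$ on each side of the center frequency are the same. If we write, for example,

$$\tilde f(0) = \frac{1}{2m+1}\left[\hat f(0) + 2\sum_{k=1}^{m} \hat f(\omega_{1+k})\right],$$

we find

$$\frac{\nu \tilde f(\omega_j)}{2 f(\omega_j)} \sim \chi^2_{\nu/2}, \qquad \omega_j = 0, .5,$$

where again $\nu = 2(2m+1)$.
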


6.1.2.  Kernels  and  Fourier  Series  Approximations 

The next natural step in estimating $f$ is to insert a weight function into the averaging of the periodogram
so that values of $\hat f$ at frequencies far removed from $\omega_j$ get less weight than values at frequencies close to $\omega_j$.
We consider integrated weighted averages of $\hat f$ of the form

$$\tilde f(\omega) = \int_0^1 K(\omega - \tau)\,\hat f(\tau)\,d\tau,$$

where K is a weight function called a (spectral) window. For a fixed frequency ω, this expression essentially
says to superimpose the weight function onto the graph of the periodogram (with its value at zero centered
at frequency ω), then multiply the two functions together (which down-weights values of $\hat f$ far removed from
the "center" frequency ω), and finally do the integral. To find the estimate at a different frequency, the
same process is carried out with the weight function moved over to be centered at the new frequency. Thus
the function K essentially dictates what part of $\hat f$ can be "seen" when finding the estimate at a certain
frequency. This is the origin of the term window. We will study windows of various shapes and discuss the
effect of their width.

Using spectral windows then is a natural extension of the idea of averaging the sample spectral density.
We could use weighted averages that are sums instead of integrals, but the integrated averages arise naturally
from the general theory of the Fourier series representation of a function on a finite interval, which we now
describe. An expression of the form

$$f(\omega) = \sum_{v=-\infty}^{\infty} R(v)\, e^{-2\pi i v \omega},$$

where

$$R(v) = \int_0^1 f(\omega)\, e^{2\pi i v \omega}\, d\omega,$$

is called the Fourier series representation of $f$, and the coefficients $R(v)$ are called its Fourier coefficients.
The periodogram

$$\hat f(\omega) = \sum_{v=-(n-1)}^{n-1} \hat R(v)\, e^{-2\pi i v \omega}$$

is thus actually an estimator of the nth partial sum

$$f_n(\omega) = \sum_{v=-(n-1)}^{n-1} R(v)\, e^{-2\pi i v \omega}$$
of the Fourier series of $f$. From the general theory of Fourier series it is well known that the partial sums can be
improved by applying a sequence of weights $k_n(v)$ to $R$ in

$$f_{n,k}(\omega) = \sum_{v=-(n-1)}^{n-1} k_n(v) R(v)\, e^{-2\pi i v \omega} = \int_0^1 f(\tau) K_n(\omega - \tau)\, d\tau.$$

The sequence $k_n$ is called a lag window, and the function

$$K_n(\omega) = \sum_{v=-(n-1)}^{n-1} k_n(v)\, e^{-2\pi i v \omega}, \qquad \omega \in (-\infty, \infty),$$

is called the corresponding spectral window. Note that $K_n$ is periodic of period 1.

Definition. A spectral estimator of the form

$$\hat f_{n,k}(\omega) = \sum_{v=-(n-1)}^{n-1} k_n(v) \hat R(v)\, e^{-2\pi i v \omega} = \int_0^1 \hat f(\tau) K_n(\omega - \tau)\, d\tau$$

is called a lag window spectral estimator. If $k_n(v) = 0$ for $|v| > M$ for some
integer M (called a truncation point), we say that the estimator is a truncated estimator.

Note that the periodogram itself is of the above form with $k_n$ identically one,
which gives the Dirichlet kernel as its spectral window.

Windows of Scale Parameter Form

A variety of weight functions have been suggested that are of the scale parameter form, that is,

$$k_n(v) = \lambda\left(\frac{v}{M}\right),$$

for some integer M < n called the scale parameter, with λ being a function satisfying

1. $\lambda(0) = 1$.

2. $\lambda(-u) = \lambda(u)$.

3. $\int_{-\infty}^{\infty} \lambda^2(u)\,du < \infty$.
Definition. The function λ defined above is called a lag window generator, while the function

$$\Lambda(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \lambda(u)\, e^{-iu\omega}\,du, \qquad \omega \in (-\infty, \infty),$$

is called the spectral window generator corresponding to λ.

For lag windows of scale parameter form, it can be shown for large M that

$$K_n(\omega) \approx 2\pi M \Lambda(2\pi M \omega).$$

In Table 6.1 we give information about eight of the lag and spectral window generators that are commonly
used. The last five columns in the table contain numerical quantities that are important for judging the
adequacy of the window. We describe these quantities below.

The first five lag window generators are zero outside of $|u| \le 1$ and thus $k_n(v) = 0$ for $|v| > M$ and $\hat f_{n,k}$
only involves values of $\hat R(v)$ for $|v| \le M$, and M is then called a truncation point. This terminology is often
used even when λ is not of truncated form.

The basic feature of the windows is that they decay rapidly from frequency zero and then rise back up
again in what are called sidelobes. All of the windows converge to a delta function as M increases. This
is analogous to letting the bandwidth of the simple averaging smoother of the previous section go to zero.
The truncated periodogram, Tukey, and Parzen-Cogburn-Davis windows are negative in certain frequency
bands, while the others are nonnegative. This is important because using a negative spectral window can
result in a spectral estimator that is negative.

Since the main lobe of a window is not rectangular, it is difficult to measure how wide it is. In the last
column of Table 6.1 we give what is called Parzen's measure of the bandwidth of a window, denoted $B_P$ and
defined to be the width of a rectangular window having the same value at ω = 0 as $K_n$ and the same area
as $K_n$. Since the areas of all of the $K_n$ are one, we have

$$B_P = \frac{1}{K_n(0)} \approx \frac{1}{2\pi M \Lambda(0)}.$$

In Figure 6.1, we have superimposed the Parzen and Tukey windows for truncation points 24 and 18,
respectively. Note how the main lobes seem to line up, which could be predicted based on the bandwidths
of the two windows.
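
The comparison of bandwidths can be checked numerically. The sketch below assumes the standard Parzen lag window generator and the 0.54/0.46 form of the Tukey generator, and computes $K_n(0)$ exactly as the sum of the lag window weights rather than through the large-M approximation. The function names are hypothetical.

```python
import math


def parzen(u):
    """Parzen lag window generator (zero outside |u| <= 1)."""
    u = abs(u)
    if u <= 0.5:
        return 1.0 - 6.0 * u ** 2 + 6.0 * u ** 3
    if u <= 1.0:
        return 2.0 * (1.0 - u) ** 3
    return 0.0


def tukey(u):
    """Tukey lag window generator in the 0.54 / 0.46 form."""
    return 0.54 + 0.46 * math.cos(math.pi * u) if abs(u) <= 1.0 else 0.0


def spectral_window(lam, M, omega):
    """K_n(omega) = sum over |v| <= M of lam(v/M) exp(-2 pi i v omega), real by symmetry."""
    return 1.0 + 2.0 * sum(lam(v / M) * math.cos(2.0 * math.pi * v * omega)
                           for v in range(1, M + 1))


def parzen_bandwidth(lam, M):
    """Parzen's bandwidth measure B_P = 1 / K_n(0)."""
    return 1.0 / spectral_window(lam, M, 0.0)
```

Under these assumptions `parzen_bandwidth(parzen, 24)` is 1/18 and `parzen_bandwidth(tukey, 18)` is 1/19.52, that is, roughly 0.056 and 0.051, which is consistent with the main lobes of the two windows lining up.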

6.1.3. Sampling Properties of Estimators

As we have pointed out, the properties of a spectral estimator based on smoothing $\hat f$ will depend on
the smoothness of $f$ and the properties of the weighting function being used. In the case of a lag window
of scale parameter form, the properties of the weighting function will depend on the scale parameter M and
the lag window generator λ.

Theorem 6.1.1. Properties of Window Estimators

Let X(1), ..., X(n) be a sample realization from a covariance stationary time series X which has spectral
density function $f$ and let λ be a lag window generator. Let

$$\hat f_{\lambda,M}(\omega) = \sum_{v=-\infty}^{\infty} \lambda\left(\frac{v}{M}\right) \hat R(v)\, e^{-2\pi i v \omega},$$

where M is chosen as a function of n so that M → ∞ as n → ∞. Note that the limits on the sum are actually
-M to M if λ is of truncated form and -(n - 1) to n - 1 otherwise. Then
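Table 6.1. Some Window Generators and Their Characteristics

Each entry lists the lag window generator $k(u)$, the spectral window generator $K(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} k(u)\,e^{-iu\omega}\,du$, the value of $\int_{-\infty}^{\infty} k^2(u)\,du$, and the bandwidth $\beta_P$. Numerical entries that cannot be read are shown as [illegible].

Generators of Truncated Form, i.e., $k(u) = 0$, $|u| > 1$

Truncated Periodogram Window: $k(u) = 1$; $K(\omega) = \frac{1}{\pi}\,\frac{\sin\omega}{\omega}$; $\int k^2 =$ [illegible]; $\beta_P =$ [illegible]

Bartlett Window: $k(u) = 1 - |u|$; $K(\omega) = \frac{1}{2\pi}\left(\frac{\sin(\omega/2)}{\omega/2}\right)^2$; $\int k^2 =$ [illegible]; $\beta_P =$ [illegible]

Tukey Window: $k(u) = 0.54 + 0.46\cos \pi u$; $K(\omega) = \frac{1}{2\pi}\,\frac{\sin\omega}{\omega}\,\frac{1.08\pi^2 - 0.16\omega^2}{\pi^2 - \omega^2}$; $\int k^2 = 0.795$; $\beta_P =$ [illegible]

Parzen Window: $k(u) = 1 - 6u^2 + 6|u|^3$ for $|u| \le 1/2$, $k(u) = 2(1 - |u|)^3$ for $1/2 \le |u| \le 1$; $K(\omega) = \frac{3}{8\pi}\left(\frac{\sin(\omega/4)}{\omega/4}\right)^4$; $\int k^2 =$ [illegible]; $\beta_P =$ [illegible]

Bohman Window: $k(u) = (1 - |u|)\cos \pi u + \frac{1}{\pi}\sin \pi|u|$; $K(\omega) = \frac{2\pi(1 + \cos\omega)}{(\pi^2 - \omega^2)^2}$; $\int k^2 =$ [illegible]; $\beta_P = 1.23/M$

Generators Not of Truncated Form

Daniell Window: $k(u) = \frac{\sin \pi u}{\pi u}$; $K(\omega) = \frac{1}{2\pi}$, $|\omega| \le \pi$; $\int k^2 =$ [illegible]; $\beta_P =$ [illegible]

Bartlett-Priestley Window: $k(u) = \frac{3}{(\pi u)^2}\left(\frac{\sin \pi u}{\pi u} - \cos \pi u\right)$; $K(\omega) = \frac{3}{4\pi}\left(1 - (\omega/\pi)^2\right)$, $|\omega| \le \pi$; $\int k^2 = 1.2$; $\beta_P =$ [illegible]

Parzen-Cogburn-Davis Window: $k(u) = \frac{1}{1 + u^{2r}}$; for $r = 2$: $K(\omega) = \frac{1}{2}\,e^{-|\omega|/\sqrt{2}}\sin\!\left(\frac{|\omega|}{\sqrt{2}} + \frac{\pi}{4}\right)$; $\int k^2 = 1.66$; $\beta_P = 0.45/M$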
Figure 6.1. The Parzen (solid curve) and Tukey (x's) Spectral Windows for Comparable Bandwidths. [Plot title: Parzen (M=24) and Tukey (M=18) Windows]

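a) The estimator is asymptotically unbiased.

b) Asymptotic Variance. If $M$ is chosen so that $M \to \infty$ and $M/n \to 0$ as $n \to \infty$, then

$$\lim_{n\to\infty} \frac{n}{M}\,\mathrm{Var}\big(\hat f_{k,M}(\omega)\big) = \begin{cases} 2 f^2(\omega)\int_{-\infty}^{\infty} k^2(u)\,du, & \omega = 0,\ .5 \\ f^2(\omega)\int_{-\infty}^{\infty} k^2(u)\,du, & \omega \ne 0,\ .5, \end{cases}$$

$$\lim_{n\to\infty} \frac{n}{M}\,\mathrm{Cov}\big(\hat f_{k,M}(\omega_1),\ \hat f_{k,M}(\omega_2)\big) = 0, \quad \omega_1 \ne \omega_2.$$

c) Confidence intervals for $f(\omega)$ are given by

$$\left(\frac{\hat f_{k,M}(\omega)}{c},\ c\,\hat f_{k,M}(\omega)\right),$$

where $c = \exp\!\left(Z_{\alpha/2}\sqrt{\frac{M}{n}\int_{-\infty}^{\infty} k^2(u)\,du}\right)$, and

$$\left(\frac{\nu \hat f_{k,M}(\omega)}{\chi^2_{\alpha/2,\nu}},\ \frac{\nu \hat f_{k,M}(\omega)}{\chi^2_{1-\alpha/2,\nu}}\right).$$

These intervals are for $f$ at a single frequency. A large sample simultaneous confidence band on $f$ for all frequencies can be obtained by multiplying both terms in the second confidence interval by a factor of the form exp((2 log ...; the remainder of this factor is illegible in the source apart from a trailing $\int k^2(u)\,du$.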
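
The following is a minimal sketch of a lag window estimator together with the first (multiplicative) confidence interval of part (c). It assumes the Parzen generator, a sample autocovariance with divisor $n$, and frequencies measured in cycles on [0, 0.5]; the function names and the white noise test series are illustrative and not taken from the text.

```python
import numpy as np
from statistics import NormalDist

def parzen(u):
    """Parzen lag window generator k(u)."""
    u = np.abs(np.asarray(u, dtype=float))
    inner = 1 - 6 * u**2 + 6 * u**3
    outer = 2 * (1 - u) ** 3
    return np.where(u <= 0.5, inner, np.where(u <= 1.0, outer, 0.0))

def sample_autocov(x, max_lag):
    """Sample autocovariances R(0..max_lag) with divisor n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    return np.array([d[: n - v] @ d[v:] / n for v in range(max_lag + 1)])

def window_estimate(x, M, freqs, alpha=0.05):
    """Return (fhat, lower, upper) at the given frequencies."""
    n = len(x)
    R = sample_autocov(x, M)
    lags = np.arange(1, M + 1)
    weighted = parzen(lags / M) * R[1:]
    # fhat(w) = R(0) + 2 * sum_{v=1}^{M} k(v/M) R(v) cos(2 pi v w)
    fhat = R[0] + 2 * np.cos(2 * np.pi * np.outer(freqs, lags)) @ weighted
    # integral of k^2 over [-1, 1], by a simple Riemann sum
    u = np.linspace(-1, 1, 20001)
    int_k2 = np.sum(parzen(u) ** 2) * (u[1] - u[0])
    z = NormalDist().inv_cdf(1 - alpha / 2)
    c = np.exp(z * np.sqrt(M / n * int_k2))
    return fhat, fhat / c, fhat * c

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)          # white noise, so f is constant
freqs = np.linspace(0, 0.5, 101)
fhat, lower, upper = window_estimate(x, M=40, freqs=freqs)
```

Because the interval is multiplicative, it has constant width on a log scale, which is one reason window estimates are often plotted as log spectra.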


The  Choice  of  M 


The choice of $M$ determines the amount of smoothing done, with too large (small) a value resulting in undersmoothing (oversmoothing). In general the choice of $M$ is very difficult unless some information about the function being estimated is known. Basically, if $f$ has a narrow peak, we would like the bandwidth of the spectral window to be narrow so that leakage doesn't occur. This however leads to undersmoothing in other frequencies. Thus nonparametric spectral estimators have trouble “resolving” peaks without introducing spurious peaks in other ranges. We will see later that parametric spectral estimators can solve this problem in many cases. In any event, the prevailing view on the choice of $M$ is to try more than one value and use them to make general statements about $f$. For example, one could use choices $M_1$, $M_2$, $M_3$ satisfying

$$.05 < \frac{M_1}{n} < .10, \qquad .10 < \frac{M_2}{n} < .25, \qquad .25 < \frac{M_3}{n} < .75.$$
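
As a small illustration of this rule of thumb, the sketch below picks one truncation point from each of the three ranges. Taking the midpoint of each range is our own arbitrary choice, and the helper name is hypothetical.

```python
def truncation_points(n):
    """Return (M1, M2, M3) with M/n near the middle of each suggested range."""
    ranges = [(0.05, 0.10), (0.10, 0.25), (0.25, 0.75)]
    return tuple(round(n * (lo + hi) / 2) for lo, hi in ranges)

n = 114  # for example, a series of length 114
M1, M2, M3 = truncation_points(n)
print(M1, M2, M3)
```

Comparing the three resulting estimates shows which features of the spectrum persist across degrees of smoothing.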


6.2. Parametric Spectral Density Estimation


We have seen two difficulties in estimating the spectral density of a covariance stationary time series by window methods: (1) it is difficult to choose the window and truncation point needed to obtain a good estimator, and (2) it is difficult to resolve sharp peaks without introducing possibly spurious peaks as well. In this section we consider using the spectral density of an ARMA model as an approximation to that of the process being analyzed. This spectral density estimation procedure is easy to implement and under a wide variety of conditions leads to estimators superior to window estimators.

6.2.1.  Autoregressive  Spectral  Estimation 

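The next theorem gives conditions under which a time series can be written as an infinite order autoregressive process. Such series include a finite order autoregressive process and an invertible MA or ARMA process.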


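Theorem 6.2.1. Conditions for AR(∞) Representation

If the spectral density $f$ of a covariance stationary time series $X$ satisfies

$$0 < \lambda_1 \le f(\omega) \le \lambda_2 < \infty, \quad \omega \in [0, 1],$$

for some $\lambda_1$ and $\lambda_2$, then $X$ can be written as an infinite order autoregressive process.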

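All ARMA processes except those that have a zero of the moving average polynomial on the unit circle have spectral densities that satisfy the conditions of this theorem. We consider approximating the spectral density $f$ of a process that is an infinite order autoregression by that of a $p$th order AR process. Given data $X(1), \ldots, X(n)$, autoregressive spectral estimation consists of three steps: (1) determining the order $p$ of the best approximating autoregression, (2) finding estimates $\hat\alpha_p(1), \ldots, \hat\alpha_p(p)$ and $\hat\sigma_p^2$ of its parameters, and (3) forming

$$\hat f_p(\omega) = \frac{\hat\sigma_p^2}{\left|\sum_{j=0}^{p} \hat\alpha_p(j)\, e^{2\pi i j \omega}\right|^2}, \quad \text{with } \hat\alpha_p(0) = 1.$$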
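
Steps (2) and (3) can be sketched as follows for a given order $p$. The sketch assumes Yule-Walker estimation (the text does not say which estimation method is intended) and the sign convention $\sum_{j=0}^{p} \alpha_j X(t-j) = \epsilon(t)$ with $\alpha_0 = 1$; the function names are illustrative.

```python
import numpy as np

def sample_autocov(x, max_lag):
    """Sample autocovariances R(0..max_lag) with divisor n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    return np.array([d[: n - v] @ d[v:] / n for v in range(max_lag + 1)])

def yule_walker(x, p):
    """Solve sum_{j=1}^p alpha_j R(|v-j|) = -R(v), v = 1..p."""
    R = sample_autocov(x, p)
    T = np.array([[R[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(T, -R[1 : p + 1])
    sigma2 = R[0] + a @ R[1 : p + 1]      # innovation variance
    return np.concatenate([[1.0], a]), sigma2

def ar_spectrum(alpha, sigma2, freqs):
    """sigma2 / |sum_j alpha_j exp(2 pi i j w)|^2 at each frequency w."""
    lags = np.arange(len(alpha))
    g = np.exp(2j * np.pi * np.outer(freqs, lags)) @ np.asarray(alpha, dtype=float)
    return sigma2 / np.abs(g) ** 2

# Simulated AR(1): X(t) - 0.5 X(t-1) = e(t), so alpha = (1, -0.5)
rng = np.random.default_rng(1)
e = rng.standard_normal(5000)
x = np.zeros_like(e)
for t in range(1, len(e)):
    x[t] = 0.5 * x[t - 1] + e[t]

alpha_hat, sigma2_hat = yule_walker(x, p=1)
fhat = ar_spectrum(alpha_hat, sigma2_hat, np.linspace(0, 0.5, 6))
```

For the true coefficients (1, -0.5) and unit error variance, the spectral density is 4 at frequency 0 and 1/2.25 at frequency 0.5, which gives a quick check on `ar_spectrum`.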


Criteria for determining the order were described earlier; here we consider Parzen's CAT criterion. The basic motivation for the criterion is contained in the next theorem.

Theorem 6.2.2. Properties of CAT

Let $\hat\sigma_p^2$ be the error variance and

$$\hat g_p(z) = \sum_{j=0}^{p} \hat\alpha_p(j)\, z^j$$

be the transfer function of the estimated AR process of order $p$ based on a realization of length $n$ from a process which can be written as an infinite order autoregressive process having coefficients $\alpha_j$, transfer function

$$g_\infty(z) = \sum_{j=0}^{\infty} \alpha_j z^j,$$

and error variance $\sigma_\infty^2$. Let CAT($p$) be the CAT criterion for order $p$. Then

$$\lim_{n\to\infty} E(\mathrm{CAT}(p)) = \lim_{n\to\infty} J_p,$$

where $J_p$ is an integral over $\omega$, weighted by $f(\omega)$, of the error of the estimated transfer function; its exact integrand is illegible in the source.

This result shows that the CAT criterion chooses the order that results asymptotically in the spectral estimator that is closest (in the sense of the integrated relative mean square error measured by $J_p$) to the true AR(∞) transfer function. This is the origin of the name “criterion autoregressive transfer function.”
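
The formula for CAT is not given in this section, so the sketch below uses one commonly quoted form of Parzen's criterion as an assumption: CAT(p) = (1/n) Σ_{j=1}^{p} 1/σ̃_j² − 1/σ̃_p² with σ̃_j² = n σ̂_j²/(n − j), CAT(0) = −(1 + 1/n), and the σ̂_j² taken relative to the sample variance. The residual variances come from the Levinson-Durbin recursion; all names are illustrative.

```python
import numpy as np

def sample_autocov(x, max_lag):
    """Sample autocovariances R(0..max_lag) with divisor n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    return np.array([d[: n - v] @ d[v:] / n for v in range(max_lag + 1)])

def residual_variances(R, max_order):
    """Levinson-Durbin recursion: error variances for orders 0..max_order."""
    a = np.array([1.0])
    sig2 = [R[0]]
    for p in range(1, max_order + 1):
        refl = -(a @ R[p:0:-1]) / sig2[-1]          # reflection coefficient
        a = np.concatenate([a, [0.0]]) + refl * np.concatenate([[0.0], a[::-1]])
        sig2.append(sig2[-1] * (1 - refl**2))
    return np.array(sig2)

def cat(x, max_order):
    """Assumed form of Parzen's CAT for orders 0..max_order."""
    n = len(x)
    R = sample_autocov(x, max_order)
    sig2 = residual_variances(R, max_order) / R[0]  # relative to the variance
    orders = np.arange(1, max_order + 1)
    inv_unbiased = (n - orders) / (n * sig2[1:])    # 1 / sigma-tilde_j^2
    values = np.cumsum(inv_unbiased) / n - inv_unbiased
    return np.concatenate([[-(1 + 1 / n)], values])

# Simulated AR(2): X(t) = 1.5 X(t-1) - 0.75 X(t-2) + e(t)
rng = np.random.default_rng(4)
e = rng.standard_normal(2100)
x = np.zeros(2100)
for t in range(2, 2100):
    x[t] = 1.5 * x[t - 1] - 0.75 * x[t - 2] + e[t]
x = x[100:]                                          # drop start-up values

values = cat(x, max_order=10)
best_order = int(np.argmin(values))
```

The order minimizing the criterion is then used in the AR spectral estimator.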
In  the  next  theorem  we  present  the  basic  sampling  properties  of  the  autoregressive  spectral  estimator. 


Theorem 6.2.3. Properties of AR Spectral Estimation


Let $X$ be a covariance stationary time series that can be expressed as an infinite order autoregressive process

$$\sum_{j=0}^{\infty} \alpha_j X(t-j) = \epsilon(t),$$

where the errors $\epsilon$ are independent and satisfy $E(\epsilon^4(t)) < \infty$. Let $\hat f_p$ be the autoregressive spectral estimator of the spectral density $f$ based on a realization of length $n$. Then

a) If $p$ is chosen so that

i) $p \to \infty$, ii) $p^3/n \to 0$, and iii) $\sqrt{n}\sum_{j=p+1}^{\infty} |\alpha_j| \to 0$,

as $n \to \infty$, then for any $k$ fixed frequencies $0 < \omega_1 < \cdots < \omega_k < 0.5$, the joint asymptotic distribution of

$$\sqrt{n/p}\,(\hat f_p(0) - f(0)),\ \sqrt{n/p}\,(\hat f_p(\omega_1) - f(\omega_1)),\ \ldots,\ \sqrt{n/p}\,(\hat f_p(\omega_k) - f(\omega_k)),\ \sqrt{n/p}\,(\hat f_p(0.5) - f(0.5))$$

is that of independent, zero mean, normal random variables having variances

$$4f^2(0),\ 2f^2(\omega_1),\ \ldots,\ 2f^2(\omega_k),\ 4f^2(.5).$$

b) If $X$ is a finite order autoregressive process, then letting $p$ be any function $g$ of $n$ that satisfies $p \to \infty$ and $p^3/n \to 0$ as $n \to \infty$ satisfies the requirements of part (a), and the variance of $\hat f_p(\omega)$ goes to zero at the rate $g(n)/n$ as $n \to \infty$.

c) If $X$ is an invertible ARMA process, then $p = \log n$ satisfies the requirements of part (a), and the variance of $\hat f_p(\omega)$ goes to zero at the rate of $\log n/n$ as $n \to \infty$.


Thus for a process that can be written as an AR(∞) process, the autoregressive spectral estimator is consistent and asymptotically normal as long as the order that we use is chosen by the conditions of the theorem. Condition (iii) for the choice of $p$ shows that the choice depends on how rapidly the coefficients of the AR(∞) representation go to zero. For a finite order AR process, these coefficients become zero beyond the true order, and we can choose $p$ to go to infinity as slowly as we want. For an invertible MA or ARMA process, one could show that the coefficients of the AR(∞) representation go to zero at a rate bounded by $\rho^p$ for some constant $\rho$, and thus choosing $p = \log n$ is sufficient.
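
Part (a) of Theorem 6.2.3 suggests pointwise large sample intervals of the form $\hat f_p(\omega) \pm Z_{\alpha/2}\,\hat f_p(\omega)\sqrt{c\,p/n}$, with $c = 4$ at frequencies 0 and 0.5 and $c = 2$ elsewhere. The sketch below is our own reading of that result, with $\hat f_p$ plugged in for $f$ in the variance; the function name is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def ar_spectrum_interval(fhat, freqs, n, p, alpha=0.05):
    """Pointwise intervals fhat +/- z * fhat * sqrt(c p / n)."""
    fhat = np.asarray(fhat, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # variance factor: 4 at the end frequencies 0 and 0.5, 2 in between
    at_ends = np.isclose(freqs, 0.0) | np.isclose(freqs, 0.5)
    factor = np.where(at_ends, 4.0, 2.0)
    se = fhat * np.sqrt(factor * p / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return fhat - z * se, fhat + z * se

lower, upper = ar_spectrum_interval([1.0, 1.0, 1.0], [0.0, 0.25, 0.5], n=1000, p=10)
```

These intervals are pointwise; they are not the simultaneous bands discussed next.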


The next theorem (due to Newton and Pagano (1984)) gives confidence bands for AR(p) spectra.


Theorem 6.2.4. Confidence Bands for AR Spectra


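The opening of the statement and the inequality bounding $f(\omega)$ for $\omega \in [0, 1]$ are illegible in the source. The quantities that can still be read are

$$\hat h(\omega) = \hat\gamma(0) + 2\sum_{v=1}^{p} \hat\gamma(v)\cos 2\pi v\omega,$$

$$\hat\gamma(v) = \frac{1}{\hat\sigma^2}\sum_{j=0}^{p-v} \hat\alpha(j)\,\hat\alpha(j+v), \quad v = 0, \ldots, p,$$

$$x^T(\omega) = (1,\ 2\cos 2\pi\omega,\ \ldots,\ 2\cos 2\pi p\omega),$$

$$D = B C B^T,$$

together with a width term of the form $x^T(\omega)\,D\,x(\omega)$ multiplied by a factor that cannot be read. The definitions of the matrices $B$ and $C$ are also illegible.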
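
The bands work with the reciprocal of the AR spectral density because that reciprocal is a finite-degree trigonometric polynomial in the frequency. The sketch below checks this identity numerically, assuming $\gamma(v) = \sigma^{-2}\sum_{j=0}^{p-v}\alpha_j\alpha_{j+v}$ and the convention $\alpha_0 = 1$; the coefficient values are arbitrary illustrations.

```python
import numpy as np

def gamma(alpha, sigma2):
    """gamma(v) = (1/sigma2) * sum_{j=0}^{p-v} alpha_j alpha_{j+v}, v = 0..p."""
    alpha = np.asarray(alpha, dtype=float)
    p = len(alpha) - 1
    return np.array([alpha[: p + 1 - v] @ alpha[v:] for v in range(p + 1)]) / sigma2

def h(alpha, sigma2, freqs):
    """h(w) = gamma(0) + 2 * sum_{v=1}^{p} gamma(v) cos(2 pi v w)."""
    g = gamma(alpha, sigma2)
    v = np.arange(1, len(g))
    return g[0] + 2 * np.cos(2 * np.pi * np.outer(freqs, v)) @ g[1:]

alpha = np.array([1.0, -1.5, 0.75])     # an arbitrary stationary AR(2)
sigma2 = 2.0
freqs = np.linspace(0, 0.5, 11)

transfer = np.exp(2j * np.pi * np.outer(freqs, np.arange(len(alpha)))) @ alpha
f = sigma2 / np.abs(transfer) ** 2      # AR spectral density
print(np.allclose(h(alpha, sigma2, freqs), 1.0 / f))
```

Since $h$ is linear in the $\gamma(v)$, a band for $h$ follows from a joint confidence region for the $\gamma(v)$, and inverting it gives a band for $f$.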
Figure 6.2. The Lynx and Sunspot Data. [Panel title: Lynx Data Divided by 1000]


6.2.2.  MA  and  ARMA  Spectral  Estimation 

The  next  natural  step  in  spectral  estimation  is  to  use  MA  and  ARMA  models  to  obtain  estimators. 
We  note  that  MA  spectral  estimation  is  very  similar  to  nonparametric  spectral  estimation  in  that  it  results 
in  an  estimator  that  is  a  finite-degree  trigonometric  polynomial.  Thus  it  suffers  from  the  same  difficulty  in 
estimating  spectra  that  have  sharp  peaks.  It  does  have  the  advantage  of  having  the  AIC  available  to  aid  in 
choosing  the  degree  of  the  polynomial. 

The use of ARMA models to estimate spectra does not have the weakness that nonparametric and moving average methods have, and much attention has been paid recently to ARMA spectral estimation. However, unless the true spectral density has both sharp peaks and troughs, autoregressive spectral estimation, with its computational simplicity, should be adequate.

6.3.  Methods  for  Determining  Periodicities 

Several of the most famous time series that have been studied in the past have appeared to contain cyclical components. For example, in Figure 6.2, we display two data sets: (1) the annual number of Canadian lynx trapped on the Mackenzie River for the years 1821-1934, and (2) the annual average value of the daily
index  of  the  number  of  sunspots  (using  the  scale  devised  by  Professor  Rudolf  Wolf  in  1849)  for  the  years 
1755-1964.  These  data  sets  have  been  extensively  studied  over  the  past  several  years  (see  for  example  Part 
4  of  the  1977  volume  of  the  Journal  of  the  Royal  Statistical  Society,  Series  A).  The  basic  property  of  these 
data  is  that  there  appear  to  be  cyclic  patterns  but  that  these  patterns  are  not  perfectly  cyclic.  The  “sunspot 
cycle”  is  a  well  known  phenomenon  and  has  an  important  effect  on  radio  communications.  The  cycle  in  the 
lynx  data  is  usually  explained  as  being  due  to  a  predator-prey  relationship  between  the  Canadian  lynx  and 
the  snowshoe  hare,  its  most  important  source  of  food. 

Time  series  such  as  the  lynx  and  sunspot  data  are  traditionally  analyzed  according  to  one  of  three 
models: 
1. As the sum of a deterministic sinusoid plus an error that is a covariance stationary time series, that is,

$$X(t) = \mu + a\cos\left(\frac{2\pi t}{p}\right) + b\sin\left(\frac{2\pi t}{p}\right) + \epsilon(t),$$

where the error series $\epsilon$ is covariance stationary, and $p$ is the period of the sinusoid.

2. As the values of a stochastic process referred to by names such as an ‘outburst’ model or a filtered Poisson process. The basic idea of this type of model is that a physical process builds up and then has a surge or outburst, followed by a decay back to some baseline value. We will not consider this model further.

3. As the values of a covariance stationary time series of near cyclic type. For example, the AR(2) model

$$X(t) + \alpha_1 X(t-1) + \alpha_2 X(t-2) = \epsilon(t)$$

will appear roughly cyclic with period $p$ if $\alpha_1 = -2\cos(2\pi/p)$ and $\alpha_2$ is close to 1.
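
One way to see why the AR(2) model in item 3 is roughly cyclic is to look at the roots of $z^2 + \alpha_1 z + \alpha_2 = 0$: when they are complex, their argument $\theta$ gives a pseudo-period $2\pi/\theta$ for the oscillation in the autocorrelations. This root-based measure is our own illustration and is not taken from the text; the helper name is hypothetical.

```python
import numpy as np

def ar2_pseudo_period(alpha1, alpha2):
    """Pseudo-period 2*pi/theta from the complex roots of z^2 + alpha1 z + alpha2."""
    roots = np.roots([1.0, alpha1, alpha2])
    if np.all(np.isreal(roots)):
        raise ValueError("real roots: the model does not oscillate")
    theta = np.abs(np.angle(roots[0]))
    return 2 * np.pi / theta

p = 10
alpha1 = -2 * np.cos(2 * np.pi / p)
for alpha2 in (0.8, 0.9, 0.98):
    print(alpha2, ar2_pseudo_period(alpha1, alpha2))
```

As $\alpha_2$ approaches 1 the pseudo-period approaches $p$, and the roots approach the unit circle, so the cycles become more regular.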


6.3.1.  Deterministic  Sinusoid  Plus  Error 

If  we  believe  that  a  time  series  is  actually  a  deterministic  sinusoid  observed  with  additive  error  and 
we  know  the  period  of  the  sinusoid,  then  we  can  use  regression  analysis  to  estimate  the  coefficients  of  the 
sinusoid  and  test  whether  the  amplitude  is  in  fact  zero.  If  the  errors  are  uncorrelated,  then  ordinary  least 
squares  can  be  used  while  if  the  errors  are  correlated,  we  can  use  the  procedure  described  in  Section  5.2  for 
doing  regression  with  autoregressive  errors.  In  fact,  in  Section  5.2  we  used  the  lynx  data  to  illustrate  the 
procedure. 
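
For the case of a known period and uncorrelated errors, the regression can be sketched with ordinary least squares as below. The simulated series, its parameter values, and the function name are illustrative assumptions; correlated errors would call for the procedure of Section 5.2 instead.

```python
import numpy as np

def fit_sinusoid(x, period):
    """OLS fit of X(t) = mu + a cos(2 pi t / p) + b sin(2 pi t / p) + error."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1)
    design = np.column_stack([
        np.ones(len(x)),
        np.cos(2 * np.pi * t / period),
        np.sin(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    return coef, resid

# Simulated data with mu = 3, a = 2, b = -1, period 10, white noise errors
rng = np.random.default_rng(2)
t = np.arange(1, 201)
x = (3.0 + 2.0 * np.cos(2 * np.pi * t / 10) - 1.0 * np.sin(2 * np.pi * t / 10)
     + 0.5 * rng.standard_normal(200))

(mu_hat, a_hat, b_hat), resid = fit_sinusoid(x, period=10)
amplitude = np.hypot(a_hat, b_hat)      # estimated amplitude of the sinusoid
```

Testing whether the amplitude is zero amounts to the usual regression F test that the cosine and sine coefficients are both zero.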

Estimating  a  Period 

If one has data $X(1), \ldots, X(n)$ and suspects that they contain a deterministic sinusoid of some unknown frequency plus noise, then this problem is often referred to as a search for hidden periodicities. It seems natural to inspect the periodogram of the data and test whether the largest value of the periodogram is significantly different from zero. If not, then we would conclude that there are no deterministic sinusoids of any of the natural frequencies in the data. If the largest value is significantly different from zero, then we must ask whether it could happen that there is a sinusoid but it is actually at a different frequency. To do this test, we can use the results of the next theorem. In this theorem we assume that $n$ is odd. The results are approximately correct if $n$ is even.


Theorem 6.3.1. Fisher's Exact Test

Let $\hat f(\omega_j)$ be the periodogram of a realization of length $n$ ($n$ odd) from the process

$$X(t) = a\cos\left(\frac{2\pi t}{p}\right) + b\sin\left(\frac{2\pi t}{p}\right) + \epsilon(t),$$

where the period $p$ is a factor of $n$, and $\epsilon$ is a Gaussian white noise series. Let $m = [n/2]$. Then

a) If $a = b = 0$, the exact distribution of

$$g = \frac{\max_{1 \le j \le [n/2]} \hat f(\omega_j)}{\sum_{j=1}^{[n/2]} \hat f(\omega_j)}$$

is given by

$$\Pr(g > z) = \sum_{j=1}^{K} (-1)^{j-1} \binom{m}{j} (1 - jz)^{m-1}, \quad z > 0,$$

where $K = [1/z]$.

b) The chance that the test given in part (a) will choose a frequency other than $1/p$ is less than the level of significance $\alpha$ that is used in the test.

Figure 6.3. Log (base 10) of the Lynx Data and the Estimated AR(11) Spectral Density.


The test in part (a) is called Fisher's exact test. For the log of the lynx data, we find that $g = 0.5967$, which has a p-value less than $10^{-4}$, and the maximum is at period 114/12 = 9.5. If one suspects the existence of $r$ sinusoids, then the above test can be modified.
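
A minimal sketch of the test follows. It assumes the periodogram is evaluated at the natural frequencies $j/n$, $j = 1, \ldots, [n/2]$, with the sample mean removed; the scaling of the periodogram does not matter because $g$ is a ratio. The function names and the simulated series are illustrative.

```python
import math
import numpy as np

def fisher_g(x):
    """Return (g, j_max, m): the statistic, the index of the largest ordinate, and m = [n/2]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = n // 2
    dft = np.fft.fft(x - x.mean())
    pgram = np.abs(dft[1 : m + 1]) ** 2 / n      # ordinates at j/n, j = 1..m
    j_max = int(np.argmax(pgram)) + 1
    return pgram.max() / pgram.sum(), j_max, m

def fisher_pvalue(g, m):
    """Pr(g > z) = sum_{j=1}^{K} (-1)^(j-1) C(m, j) (1 - j z)^(m-1), K = [1/z]."""
    K = min(int(math.floor(1.0 / g)), m)
    total = sum((-1) ** (j - 1) * math.comb(m, j) * (1 - j * g) ** (m - 1)
                for j in range(1, K + 1))
    return min(max(total, 0.0), 1.0)

# Simulated series of odd length with a sinusoid of period 9 (frequency 11/99)
rng = np.random.default_rng(3)
t = np.arange(1, 100)
x = np.cos(2 * np.pi * t / 9) + rng.standard_normal(99)

g, j_max, m = fisher_g(x)
p_value = fisher_pvalue(g, m)
print(g, len(x) / j_max, p_value)
```

With the values quoted above for the log lynx data ($g = 0.5967$ and $m = 57$), only the $j = 1$ term of the sum is present and `fisher_pvalue(0.5967, 57)` is far below $10^{-4}$, in agreement with the text.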

A  crucial  assumption  in  the  Fisher  test  is  that  the  errors  are  white  noise.  If  they  are  not,  that  is, 
the  spectral  density  of  the  error  time  series  is  not  a  constant,  then  a  large  value  of  the  periodogram  at  a 
particular  frequency  could  be  due  to  either  (1)  the  existence  of  a  sinusoid  having  that  frequency,  (2)  the 
spectral  density  being  large  at  that  frequency,  or  (3)  a  combination  of  the  two.  This  is  referred  to  as  the 
mixed  spectrum  case  (see  Section  6.3  of  Priestley  (1981)),  and  unless  one  has  some  prior  knowledge  of  the 
nature of $f$ or the sinusoids involved, separating the two parts of the model is very difficult.

6.3.2.  Estimating  Peak  Frequencies  in  AR  Spectra 

For a process of near cyclic type, the data appear to be cyclic except that the lengths of cycles vary from one cycle to the next. Thus such a model is often referred to as a disturbed periodicity model, and the analog of determining the period of a deterministic sinusoid is to determine the frequency $\omega^*$ where the spectral density of the process has a peak. Searching for peak frequencies is not difficult if the process is a finite order MA or AR process as then $f$ (or its reciprocal in the AR case) is a finite-degree trigonometric polynomial and finding the critical values (maxima and minima) of such a polynomial is not difficult. Given a realization of length $n$ from an AR($p$) process, we can estimate the peak frequency $\omega^*$ by finding the frequency where the reciprocal of $\hat f$ has a relative minimum. We will denote this estimator by $\hat\omega$. If the order $p$ is unknown, then we can estimate it and the coefficients of the estimated order process and again use the process described above to estimate the peak frequency. This estimator is denoted by $\hat\omega_{\hat p}$. The properties of these estimators are given in the next theorem.
Theorem 6.3.2. Estimating a Peak Frequency


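Let $\hat\omega$ and $\hat\omega_{\hat p}$ be the autoregressive spectral estimates described above of a peak frequency $\omega^*$ based on a realization of size $n$ from a Gaussian AR($p$, $\alpha$, $\sigma^2$) process. Then

a) $\sqrt{n}\,(\hat\omega - \omega^*) \xrightarrow{L} N(0, \sigma^2(\omega^*))$, where

$$\sigma^2(\omega^*) = \frac{b^T(\omega^*)\, C(\alpha)\, S(\alpha)\, C^T(\alpha)\, b(\omega^*)}{\left(h''(\omega^*)/2\pi\right)^2},$$

where $S(\alpha)$ is the Schur matrix corresponding to $\alpha$,

$$b^T(\omega) = (\sin 2\pi\omega,\ 2\sin 4\pi\omega,\ \ldots,\ p\sin 2\pi p\omega),$$

$h(\omega) = \sigma^2/f(\omega)$, and $C(\alpha)$ is the $p \times p$ matrix having $(j,k)$th element

$$C_{jk} = \alpha_{j+k} + \cdots$$

(the second term is illegible in the source), with $\alpha_v = 0$ if $v > p$ or $v < 0$.

b) If a consistent order-determining criterion is used for finding $p$, then the results of part (a) continue to hold for $\hat\omega_{\hat p}$.

c) If an order-determining criterion that is guaranteed asymptotically to not underestimate the order is used to find $p$, then $\hat\omega_{\hat p}$ is a consistent estimator of $\omega^*$.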


From these results, we can find a large sample confidence interval for $\omega^*$ as

$$\hat\omega \pm Z_{\alpha/2}\,\sigma(\hat\omega)/\sqrt{n},$$

while the lower and upper limits in a confidence interval for the reciprocal of $\omega^*$, that is, for the period of the peak, are given by the reciprocals of the upper and lower limits for the frequency. Notice the important role played by the second derivative of $h$ in the asymptotic variance of $\hat\omega$. For a sharp peak, $h''$ will be large and thus the confidence interval will be narrow. For a broad peak, the interval will be wide.
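
The two computational pieces of this procedure can be sketched as follows: locating the peak frequency of an AR spectrum, and turning a frequency interval into a period interval by taking reciprocals. The sketch uses a simple grid search rather than solving for the critical values of the polynomial, and it takes the asymptotic standard deviation as a user-supplied number (its computation from the Schur matrix is not attempted here), with the standard error of $\hat\omega$ assumed to be $\sigma/\sqrt{n}$ as part (a) of Theorem 6.3.2 implies. All names and numerical inputs are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def peak_frequency(alpha, grid_size=20001):
    """Frequency in (0, 0.5) where |sum_j alpha_j exp(2 pi i j w)|^2 is smallest,
    that is, where the AR spectral density has its largest peak."""
    alpha = np.asarray(alpha, dtype=float)
    freqs = np.linspace(0.0, 0.5, grid_size)[1:-1]
    lags = np.arange(len(alpha))
    h = np.abs(np.exp(2j * np.pi * np.outer(freqs, lags)) @ alpha) ** 2
    return freqs[np.argmin(h)]

def period_interval(omega_hat, sigma_hat, n, alpha_level=0.05):
    """Interval for the period 1/omega from omega_hat +/- z * sigma_hat / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha_level / 2)
    half = z * sigma_hat / np.sqrt(n)
    lower_freq, upper_freq = omega_hat - half, omega_hat + half
    return 1.0 / upper_freq, 1.0 / lower_freq

# AR(2) with X(t) - 1.5 X(t-1) + 0.9 X(t-2) = e(t)
omega_peak = peak_frequency([1.0, -1.5, 0.9])
period_lo, period_hi = period_interval(0.1, sigma_hat=0.05, n=100)
```

For an AR(2) process the peak can be checked in closed form, since setting the derivative of $h$ to zero gives $\cos 2\pi\omega^* = -\alpha_1(1 + \alpha_2)/(4\alpha_2)$.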

A crucial part of the method of this section is the fact that we have been finding a zero of a finite-degree polynomial. Thus if $f$ is of the form of an ARMA spectral density, that is, the ratio of two finite-degree polynomials, we cannot apply the above procedure. However, as long as $f$ can be expressed as the spectral density of either an AR(∞) or MA(∞) process, we should be able to apply the above procedure and obtain asymptotically good properties.

To illustrate the use of this procedure, we consider the lynx data again. While there is no clear agreement among analysts what AR process best models the log of the lynx data, we consider the result of using the order 11 process determined in Section 5.2. In Figure 6.3 we give plots of the log (base 10) of the data and the estimated AR(11) spectral density. The confidence interval for the period corresponding to the largest peak in the spectral density is (9.32, 10.05) years.